I was playing around with Tensorflow for image classification. I used the image_retraining/retrain.py to retrain the inception library with new categories and used it to classify images using label_image.py from https://github.com/llSourcell/tensorflow_image_classifier/blob/master/src/label_image.py as below:
import tensorflow as tf
import sys
# change this as you see fit
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
in tf.gfile.GFile("/root/tf_files/output_labels.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
# Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
#predictions = sess.run(softmax_tensor,{'DecodeJpeg/contents:0': image_data})
predictions = sess.run(softmax_tensor,{'DecodePng/contents:0': image_data})
# Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
for node_id in top_k:
human_string = label_lines[node_id]
score = predictions[0][node_id]
print('%s (score = %.5f)' % (human_string, score))
I noticed two issues. When I retrain with new categories, it only trains JPG images. I am a noob in machine learning so not sure whether this is a limitation or is it possible to train other extension images like PNG, GIF?
Another one is when classifying the images the input is again only for JPG. I tried to change DecodeJpeg to DecodePng in label_image.py above but couldn't work. Another way I tried was to convert other formats into JPG before passing them in for classification like:
im = Image.open('/root/Desktop/200_s.gif').convert('RGB')
im.save('/root/Desktop/test.jpg', "JPEG")
image_path1 = '/root/Desktop/test.jpg'
Is there any other way to do this? Does Tensorflow have functions to handle other image formats other than JPG?
I tried the following by feeding in parsed image as compared to JPEG as suggested by #mrry
import tensorflow as tf
import sys
import numpy as np
from PIL import Image
# change this as you see fit
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
image = Image.open(image_path)
image_array = np.array(image)[:,:,0:3] # Select RGB channels only.
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
in tf.gfile.GFile("/root/tf_files/output_labels.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
# Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
predictions = sess.run(softmax_tensor,{'DecodeJpeg:0': image_array})
# Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
for node_id in top_k:
human_string = label_lines[node_id]
score = predictions[0][node_id]
print('%s (score = %.5f)' % (human_string, score))
It works for JPEG images but when I use PNG or GIF it throws
Traceback (most recent call last):
File "label_image.py", line 17, in <module>
image_array = np.array(image)[:,:,0:3] # Select RGB channels only.
IndexError: too many indices for array
The model can only train on (and evaluate) JPEG images, because the GraphDef that you've saved in /root/tf_files/output_graph.pb only contains a tf.image.decode_jpeg() op, and uses the output of that op for making predictions. There are at least a couple of options for using other image formats:
Feed in parsed images rather than JPEG data. In the current program, you feed in a JPEG-encoded image as a string value for the tensor "DecodeJpeg/contents:0". Instead, you can feed in a 3-D array of decoded image data for the tensor "DecodeJpeg:0" (which represents the output of the tf.image.decode_jpeg() op), and you can use NumPy, PIL, or some other Python library to create this array.
Remap the image input in tf.import_graph_def(). The tf.import_graph_def() function enables you to connect two different graphs together by remapping individual tensor values. For example, you could do something like the following to add a new image-processing op to the existing graph:
image_string_input = tf.placeholder(tf.string)
image_decoded = tf.image.decode_png(image_string_input)
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
softmax_tensor, = tf.import_graph_def(
graph_def,
input_map={"DecodeJpeg:0": image_decoded},
return_operations=["final_result:0"])
with tf.Session() as sess:
# Feed the image_data as input to the graph and get first prediction
predictions = sess.run(softmax_tensor, {image_string_input: image_data})
# ...
You should have a look at the tf.image package. It's got good functions to decode / encode JPEGs, GIFs and PNGs.
Following #mrry's suggestion to feed in parsed image, converted the image data into array and convert into RGB as stated below in the code. Now I am able to feed in JPG,PNG and GIF.
import tensorflow as tf
import sys
import numpy as np
from PIL import Image
# change this as you see fit
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
image = Image.open(image_path)
image_array = image.convert('RGB')
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
in tf.gfile.GFile("/root/tf_files/output_labels.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
# Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
predictions = sess.run(softmax_tensor,{'DecodeJpeg:0': image_array})
# Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
for node_id in top_k:
human_string = label_lines[node_id]
score = predictions[0][node_id]
print('%s (score = %.5f)' % (human_string, score))
Related
I have already converted a pre-trained .ckpt file to .pb file freezing the model and saving the weighs as well. What I am trying to do now is to make a simple inference using that .pb file and extract and save output image. The model is a (Fully Convolutional Network for Semantic Segmentation) downloaded from here : https://github.com/MarvinTeichmann/KittiSeg . So far I have managed to, load the image, set the default tf graph and import the graph defined by the model on that, read the input and the output tensors and run the session (error here).
import tensorflow as tf
import os
import numpy as np
from tensorflow.python.platform import gfile
from PIL import Image
# Read the image & get statstics
img=Image.open('/path-to-image/demoImage.png')
img.show()
width, height = img.size
print(width)
print(height)
#Plot the image
#image.show()
with tf.Graph().as_default() as graph:
with tf.Session() as sess:
# Load the graph in graph_def
print("load graph")
# We load the protobuf file from the disk and parse it to retrive the unserialized graph_drf
with gfile.FastGFile("/path-to-FCN-model/FCN8.pb",'rb') as f:
#Set default graph as current graph
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
#sess.graph.as_default() #new line
# Import a graph_def into the current default Graph
tf.import_graph_def(graph_def, name='')
# Print the name of operations in the session
#for op in sess.graph.get_operations():
#print "Operation Name :",op.name # Operation name
#print "Tensor Stats :",str(op.values()) # Tensor name
# INFERENCE Here
l_input = graph.get_tensor_by_name('Placeholder:0')
l_output = graph.get_tensor_by_name('save/Assign_38:0')
print "l_input", l_input
print "l_output", l_output
print
print
# Acceptable feed values include Python scalars, strings, lists, numpy ndarrays, or TensorHandles.
result = sess.run(l_output, feed_dict={l_input : img})
print(results)
print("Inference done")
# Info
# First Tensor name : Placeholder:0
# Last tensor name : save/Assign_38:0"
Can the error come from the format of the image (e.g should I convert .png to another format?). Is it another fundamental error?
I managed to fix the error, below is the working script to inference a single image on Fully Convolutional Networks (for whoever is interesting in an alternative segmentation algorithm from SEGNET) . This model use billinear interpolation for scaling rather than an un-pooling layer. Anyway, because the model is available to download in a .chkpt format, you must first freeze the model and save it as a .pb file. Later on, you must pass the network from TF optimizer to set Dropout probabilities to 1. Afterwards, set the correct input and output tensor name in this script and the inference works correctly, extracting the segmented image.
import tensorflow as tf # Default graph is initialized when the library is imported
import os
from tensorflow.python.platform import gfile
from PIL import Image
import numpy as np
import scipy
from scipy import misc
import matplotlib.pyplot as plt
import cv2
with tf.Graph().as_default() as graph: # Set default graph as graph
with tf.Session() as sess:
# Load the graph in graph_def
print("load graph")
# We load the protobuf file from the disk and parse it to retrive the unserialized graph_drf
with gfile.FastGFile("/path-to-protobuf/FCN8_Freezed.pb",'rb') as f:
print("Load Image...")
# Read the image & get statstics
image = scipy.misc.imread('/Path-To-Image/uu_000010.png')
image = image.astype(float)
Input_image_shape=image.shape
height,width,channels = Input_image_shape
print("Plot image...")
#scipy.misc.imshow(image)
# Set FCN graph to the default graph
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
sess.graph.as_default()
# Import a graph_def into the current default Graph (In this case, the weights are (typically) embedded in the graph)
tf.import_graph_def(
graph_def,
input_map=None,
return_elements=None,
name="",
op_dict=None,
producer_op_list=None
)
# Print the name of operations in the session
for op in graph.get_operations():
print "Operation Name :",op.name # Operation name
print "Tensor Stats :",str(op.values()) # Tensor name
# INFERENCE Here
l_input = graph.get_tensor_by_name('Inputs/fifo_queue_Dequeue:0') # Input Tensor
l_output = graph.get_tensor_by_name('upscore32/conv2d_transpose:0') # Output Tensor
print "Shape of input : ", tf.shape(l_input)
#initialize_all_variables
tf.global_variables_initializer()
# Run Kitty model on single image
Session_out = sess.run( l_output, feed_dict = {l_input : image}
Have you already looked at the demo.py. There is shown at line 141 how they modify the input of the graph:
# Create placeholder for input
image_pl = tf.placeholder(tf.float32)
image = tf.expand_dims(image_pl, 0)
# build Tensorflow graph using the model from logdir
prediction = core.build_inference_graph(hypes, modules,
image=image)
And at line 164 how the image is opened:
image = scp.misc.imread(input_image)
Which is fed directly to image_pl. The only point is that core.build_inference_graph is a TensorVision call.
Note, it would be interesting to provide the exact error message as as input as well.
I have retrained two different classification models models using retrain.py.
For predicting labels for two images I have created getLabel method from Label_image.py as follows:
def getLabel(localFile, graphKey, labelKey):
image_data_str = tf.gfile.FastGFile(localFile, 'rb').read()
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
in tf.gfile.GFile(labelKey)]
# Unpersists graph from file
with tf.gfile.FastGFile(graphKey, 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
sess = tf.Session()
with sess:
# Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
predictions = sess.run(softmax_tensor, {'DecodeJpeg/contents:0': image_data_str})
# Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
series = []
count = 1
for node_id in top_k:
human_string = label_lines[node_id]
if count==1:
label = human_string
count+=1
score = predictions[0][node_id]
print('%s (score = %.5f)' % (human_string, score))
series.append({"name": human_string, "data": [score * 100]})
sess.close()
return label, series
And I am calling them as
label,series = predict.getLabel(localFile, 'graph1.pb', 'labels1.txt')
label,series = predict.getLabel(localFile, 'graph2.pb', 'labels2.txt')
But for the second function call it is using the old graph i.e. graph1.pb & it is giving below error since model 1 has more categories than model 2.
human_string = label_lines[node_id]
IndexError: list index out of range
I am not able to understand why is this happening. Can someone tell how to load second graph??
It looks like what is happening is that you are calling the same session for both calls to predict.getFinalLabel. What you should do is define two separate sessions, and initialize each separately (e.g. have predict1.getFinalLabel and predict2.getFinalLabel). If you post more of your code, I can provide more detail and code.
I am writing a Python program to detect the state of a chess board and I am using a sliding window to detect the position of each piece. My main program detects the chessboard within an image and passes its cropped picture to the my_sliding_window method. This is supposed to use Tensorflow to detect a piece in the sliding window. From this tutorial I saw that pictures are read like this:
image_data = tf.gfile.FastGFile('picture.jpg', 'rb').read()
But I don't want to read it from file as I already have the picture in a numpy array. How do make my numpy array in such a way that it can be classified by Tensorflow?
Thank you.
Code:
import tensorflow as tf, sys
import cv2
image_path = sys.argv[1]
img = cv2.imread('picture.jpg')
image_data = tf.convert_to_tensor(img)
print type(image_data) # this returns <class 'tensorflow.python.framework.ops.Tensor'>
# This is what is used in the tutorial I mentioned above
image_data2 = tf.gfile.FastGFile(image_path, 'rb').read()
print type(image_data2) # this returns <type 'str'>
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
in tf.gfile.GFile("retrained_labels.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("retrained_graph.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
# Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
predictions = sess.run(softmax_tensor, \
{'DecodeJpeg/contents:0': image_data})
# Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
for node_id in top_k:
human_string = label_lines[node_id]
score = predictions[0][node_id]
print('%s (score = %.5f)' % (human_string, score))
You could use tf.convert_to_tensor() to convert your numpy array into a TensorFlow tensor:
This function converts Python objects of various types to Tensor objects. It accepts Tensor objects, numpy arrays, Python lists, and Python scalars.
Update
Ok, so what you're trying to do is to feed the numpy array image_data, with dimensions [123, 82] to the placeholder DecodeJpeg/contents:0. However that placeholder was defined with shape=() meaning it only accepts 0D tensors as input (see tensor shapes), hence throwing you an error.
What the original code does is to read an image as a dimensionless string with:
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
which is then fed to the DecodeJpeg/contents:0 placeholder in:
predictions = sess.run(softmax_tensor, {'DecodeJpeg/contents:0': image_data})
The easiest way to proceed and try to run your images through the pretrained graph would be to use the same tf.gfile.FastGFile() call for loading the images.
I solved the problem by using this one-liner:
image_data = cv2.imencode('.jpg', cv_image)[1].tostring()
I went through the Tensorflow for poets Tutorial and then classified my own images, this is all done in the TF docker container provided. The model has a validation accuracy in the mid to low 90's. There is a separate file that makes predictions for new images(below).
I copied the files 'retrained_labels_corn.txt' and 'retrained_graph_corn.pd' and the file holding the code seen below to a directory(and changed the file paths) to see if I could make predictions while not in the docker container. I made sure I give it a valid image path as the system arg but it always predicts the image as one class with a probability above 97%. When I do the same thing while in the docker container everything works fine. I even tried pointing the labeling file to the exact same files that docker container uses and I am getting the same result of it always predicting one class with a high degree of certainty.
What did I do wrong?
import tensorflow as tf, sys
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
in tf.gfile.GFile("/tf_files/retrained_labels_corn.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/tf_files/retrained_graph_corn.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
# Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
predictions = sess.run(softmax_tensor, \
{'DecodeJpeg/contents:0': image_data})
# Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
for node_id in top_k:
human_string = label_lines[node_id]
score = predictions[0][node_id]
print('%s (score = %.5f)' % (human_string, score))
I am on Ubuntu version 16.04 and TF 0.10
I am using the Tensorflow image classification example (https://www.tensorflow.org/versions/r0.9/tutorials/image_recognition/index.html).
How could I classify multiple images at a time?
EDIT: Ideally, I would just pass in one image and a number (nb) as arguments, and then make the input-to-be-classified nb iterations of that image
The file is classify_image.py, and the important portion is:
def run_inference_on_image(image):
"""Runs inference on an image.
Args:
image: Image file name.
Returns:
Nothing
"""
if not tf.gfile.Exists(image):
tf.logging.fatal('File does not exist %s', image)
image_data = tf.gfile.FastGFile(image, 'rb').read()
# Creates graph from saved GraphDef.
create_graph()
with tf.Session() as sess:
# Some useful tensors:
# 'softmax:0': A tensor containing the normalized prediction across
# 1000 labels.
# 'pool_3:0': A tensor containing the next-to-last layer containing 2048
# float description of the image.
# 'DecodeJpeg/contents:0': A tensor containing a string providing JPEG
# encoding of the image.
# Runs the softmax tensor by feeding the image_data as input to the graph.
softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
predictions = sess.run(softmax_tensor,
{'DecodeJpeg/contents:0': image_data})
predictions = np.squeeze(predictions)
# Creates node ID --> English string lookup.
node_lookup = NodeLookup()
top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
for node_id in top_k:
human_string = node_lookup.id_to_string(node_id)
score = predictions[node_id]
print('%s (score = %.5f)' % (human_string, score))
def main(_):
maybe_download_and_extract()
image = (FLAGS.image_file if FLAGS.image_file else
os.path.join(FLAGS.model_dir, 'cropped_panda.jpg'))
run_inference_on_image(image)
The code relevant to you would be this section:
def main(_):
maybe_download_and_extract()
image = (FLAGS.image_file if FLAGS.image_file else
os.path.join(FLAGS.model_dir, 'cropped_panda.jpg'))
run_inference_on_image(image)
In order to have predictions for all the png, jpeg or jpg files in a "images" folder, you could do this:
def main(_):
maybe_download_and_extract()
# search for files in 'images' dir
files_dir = os.getcwd() + '/images'
files = os.listdir(files_dir)
# loop over files, print prediction if it is an image
for f in files:
if f.lower().endswith(('.png', '.jpg', '.jpeg')):
image_path = files_dir + '/' + f
print run_inference_on_image(image_path)
This should print out the predictions for all your images in that folder