I'm trying to extract predictions, use them to calculate accuracy/precision/recall/F1, and also get the prediction probabilities. I know I have 10 output classes, so I can't calculate precision per se, but I will be doing all of this in other models; moreover, I'd like to be able to extract prediction probabilities. My model is as follows. I've checked GitHub and StackOverflow, but I have yet to find a way to extract those properties; most of the answers come close but never answer what I needed. I've used low epoch numbers here to check the model quickly and keep the output less crowded.
import tensorflow as tf
from tensorflow.contrib.layers import fully_connected
from sklearn.datasets import fetch_mldata
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
mnist = fetch_mldata('MNIST original', data_home="data/mnist/")
lb = LabelBinarizer().fit(mnist.target)
X_train, X_test, y_train, y_test = train_test_split(mnist.data, lb.transform(mnist.target), train_size=0.9, test_size=0.1)
X = tf.placeholder(tf.float32, shape=(None, 784))
y = tf.placeholder(tf.int64, shape=(None, 10))
lOne = fully_connected(inputs=X, num_outputs=100, activation_fn=tf.nn.elu)
logits = fully_connected(inputs=lOne, num_outputs=10, activation_fn=tf.nn.softmax)
pred = logits
acc = tf.metrics.accuracy(labels=y, predictions=pred)
loss = tf.losses.softmax_cross_entropy(logits=logits, onehot_labels=y)
trainOP = tf.train.AdamOptimizer(0.001).minimize(loss)
import numpy as np
bSize = 100
batches = int(np.floor(X_train.shape[0]/bSize)+1)
def batcher(dSet, bNum):
    return(dSet[bSize*(bNum-1):bSize*(bNum)])
epochs = 2
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(0, epochs):
        for batch in range(1, batches):
            X_batch = batcher(X_train, batch)
            y_batch = batcher(y_train, batch)
            sess.run(trainOP, feed_dict={X: X_batch, y: y_batch})
        lossVal = sess.run([loss], feed_dict={X: X_test, y: y_test})
        print(lossVal)
sess.close()
The code shared in the question covers training, but not "using" (inferring) with the resulting model.
Two issues:
The trained model is not serialized, so future runs will run on an untrained model and predict whatever their initialization tells them to. Hence the comment on the question suggesting to save the trained model and restore it when predicting.
The logits are the output of a SoftMax function. A common way to get a class from logits is to select the highest value in the tensor (here a vector).
With TensorFlow, the last point can be done with tf.argmax ("Returns the index with the largest value across axes of a tensor."):
tf.argmax(input=logits, axis=1)
All in all, the question's code only partially covers the MNIST tutorial from the TensorFlow team. Perhaps there are more pointers there if you get stuck with this code.
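As a minimal sketch of how that could be wired into the question's graph (reusing its logits, X and the training session; the names pred_class and pred_prob are only illustrative, and note that logits here is already the softmax output):
pred_class = tf.argmax(logits, axis=1)     # index of the most likely class
pred_prob = tf.reduce_max(logits, axis=1)  # probability of that class
classes, probs = sess.run([pred_class, pred_prob], feed_dict={X: X_test})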
I'm writing this in case anyone stumbles upon this particular case. I built a network following basic MNIST examples, used tf.nn.softmax in the final layer, and expected to get results from said layer. It looks like I need to apply the softmax function again to get the results from that layer, i.e. yPred = tf.nn.softmax(logits), with logits being the name of the output layer. I'm adding the fixed code below.
I could add a line to save the model, load it later, and make predictions on the saved model. Since this is just an example of me building the model, I've omitted the saving part.
import tensorflow as tf
from tensorflow.contrib.layers import fully_connected
from sklearn.datasets import fetch_mldata
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
mnist = fetch_mldata('MNIST original', data_home="data/mnist/")
lb = LabelBinarizer().fit(mnist.target)
X_train, X_test, y_train, y_test = train_test_split(mnist.data, lb.transform(mnist.target), train_size=0.9, test_size=0.1, stratify = mnist.target, random_state=42)
X = tf.placeholder(tf.float32, shape=(None, 784))
y = tf.placeholder(tf.int64, shape=(None, 10))
lOne = fully_connected(inputs=X, num_outputs=100, activation_fn=tf.nn.elu)
lTwo = fully_connected(inputs=lOne, num_outputs=100, activation_fn=tf.nn.elu)
logits = fully_connected(inputs=lTwo, num_outputs=10, activation_fn=tf.nn.softmax)
pred = tf.nn.softmax(logits)
acc_bool = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
acc_Num = tf.cast(acc_bool, tf.float32)
acc_Mean = tf.reduce_mean(acc_Num)
loss = tf.losses.softmax_cross_entropy(logits=logits, onehot_labels=y)
trainOP = tf.train.AdamOptimizer(0.001).minimize(loss)
import numpy as np
bSize = 1024
batches = int(np.floor(X_train.shape[0]/bSize)+1)
def batcher(dSet, bNum):
    return(dSet[bSize*(bNum-1):bSize*(bNum)])
epochs = 250
init = tf.global_variables_initializer()
trainA = []
testA = []
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(0, epochs):
        for batch in range(1, batches):
            X_batch = batcher(X_train, batch)
            y_batch = batcher(y_train, batch)
            sess.run(trainOP, feed_dict={X: X_batch, y: y_batch})
        if epoch % 25 == 1:
            trainLoss, trainAcc = sess.run([loss, acc_Mean], feed_dict={X: X_train, y: y_train})
            testLoss, testAcc = sess.run([loss, acc_Mean], feed_dict={X: X_test, y: y_test})
    yPred = sess.run(pred, feed_dict={X: X_test[0].reshape(1,-1), y: y_test[0].reshape(1,-1)})
    print(yPred)
sess.close()
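To cover the rest of the original goal (per-class precision/recall/F1 from the predictions, plus the saving step I omitted), a minimal sketch built on top of this graph could look like the one below. It assumes scikit-learn is available and uses a hypothetical checkpoint path, so treat it as an illustration rather than the exact code I ran:
from sklearn.metrics import precision_recall_fscore_support

saver = tf.train.Saver()  # create before the session so it picks up the graph's variables
with tf.Session() as sess:
    sess.run(init)
    # ... training loop as above ...
    probs = sess.run(pred, feed_dict={X: X_test})    # class probabilities
    yPredClass = np.argmax(probs, axis=1)            # predicted class indices
    yTrueClass = np.argmax(y_test, axis=1)           # one-hot labels back to class indices
    precision, recall, f1, _ = precision_recall_fscore_support(yTrueClass, yPredClass, average=None)
    saver.save(sess, "data/mnist/model.ckpt")        # hypothetical path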
Related
I am training a Keras model to perform some simple categorisation tasks. In my case, the model needs to learn how to make a judgement based on the task cue and a given task stimulus. The task stimulus is an array of 5 randomly generated numbers. To train the model I don't need more than 1 epoch, as learning a specific stimulus is not the goal, so I set the number of epochs to 1. However, I don't want the model to reach 100% accuracy on the validation dataset, but rather 80%.
To achieve this goal, I used a callback function to stop training when the training accuracy reached 80%. But then I found that the accuracy on the validation dataset is much better than the training accuracy (see here). Since I only have one epoch here, how should I set up the callback function to make sure the model has 80% accuracy on the validation dataset? Thanks in advance!
Here are codes:
import numpy as np
import random
import tensorflow as tf
from tensorflow import keras
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
# random seed
from numpy.random import seed
seed(1234)
random.seed(10)
# set up a simple categorization task
def tasksets(train_num, test_num, task_num):
    x_train, y_train, rules_train = train_sequence(train_num, task_num)
    x_test, y_test, rules_test = train_sequence(test_num, task_num)
    return x_train, x_test, rules_train, y_train, y_test, rules_test
# generating training sequence
def train_sequence(trial_num, task_num):
    x = np.zeros((trial_num, task_num), dtype=np.float64)
    y = np.zeros(trial_num, dtype=np.float64)
    rules = np.zeros(trial_num, dtype=np.float64)
    rulepool = []
    for r in range(task_num):
        rulepool = rulepool + [r]*int(trial_num/task_num)
    random.shuffle(rulepool)
    for i in range(trial_num):
        for t in range(task_num):
            x[i,t] = random.random()  # multi-dimensional stimuli
        rule_idx = rulepool.pop(random.randint(0, len(rulepool)-1))
        rules[i] = rule_idx
        if x[i, rule_idx] <= 0.5:
            answer = 0  # no
        elif x[i, rule_idx] > 0.5:
            answer = 1  # yes
        y[i] = answer
    x = np.reshape(x, (trial_num, task_num))
    y = tf.one_hot(y, 2)
    rules = np.reshape(rules, (trial_num))
    rules = tf.one_hot(rules, depth=task_num)
    return x, y, rules
def build_network(task_num, learning_rate: float = 0.001):
    # nsteps = 1
    input_dims = task_num
    inputs_stimuli = Input(shape=(input_dims,), name="stimulus")
    inputs_rule = Input(shape=(input_dims,), name="rule")
    hid1 = Dense(6, activation='relu', name="stimulus_representation")(inputs_stimuli)
    hid2 = Dense(6, activation='relu', name="rule_representation")(inputs_rule)  # back to dense
    fuse = keras.layers.concatenate([hid1, hid2])  # combine multiple inputs
    decision = Dense(100, activation="relu", name="decision")(fuse)  # back to dense
    output = Dense(2, activation="softmax", name="output")(decision)
    model = Model(inputs=[inputs_stimuli, inputs_rule], outputs=output)
    loss = tf.keras.losses.CategoricalCrossentropy()
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                  loss=loss, metrics=["accuracy"])
    return model
# Instantiate a callback object
accuracy_threshold = 0.80
class myCallback(tf.keras.callbacks.Callback):
    def on_train_batch_end(self, batch, logs={}):
        keys = list(logs.keys())
        print("...Training: end of batch {}; got log keys: {}".format(batch, keys))
        if logs.get('accuracy') > accuracy_threshold:
            print("\nReached %2.2f%% accuracy, so stopping training!!" % (accuracy_threshold*100))
            self.model.stop_training = True
callbacks = myCallback()
# the model is trained until it reaches 80% accuracy in the test
check = 0
test_threshold = 1
task_num = 5
batch_size = 8
model = build_network(task_num)
x_train, x_test, rule_train, y_train, y_test, rule_test = tasksets(10000*task_num, 100*task_num, task_num)
results = model.fit([x_train, rule_train], y_train, epochs = 1,
batch_size = batch_size, callbacks=callbacks,
validation_data = ([x_test, rule_test], y_test))
I came across a weird difference between the Keras model.fit() and sklearn model.fit() functions. When model.fit() is called inside a loop, I get inconsistent predictions using a Keras sequential model. This is not the case when using an sklearn model. See the sample code below to reproduce the phenomenon.
from numpy.random import seed
seed(1337)
import tensorflow as tf
tf.random.set_seed(1337)
from sklearn.linear_model import LogisticRegression
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import InputLayer
from sklearn.datasets import make_blobs
from sklearn.preprocessing import MinMaxScaler
import numpy as np
def get_sequential_dnn(NUM_COLS, NUM_ROWS):
    # code for model
    ...

if __name__ == "__main__":
    input_size = 10
    X, y = make_blobs(n_samples=100, centers=2, n_features=input_size,
                      random_state=1)
    scalar = MinMaxScaler()
    scalar.fit(X)
    X = scalar.transform(X)
    model = get_sequential_dnn(X.shape[1], X.shape[0])
    # print(model.summary())
    # model = LogisticRegression()
    for i in range(2):
        model.fit(X, y, epochs=100, verbose=0, shuffle=False)
        # model.fit(X, y)
        Xnew, _ = make_blobs(n_samples=3, centers=2, n_features=10, random_state=1)
        Xnew = scalar.transform(Xnew)
        # make a prediction
        # ynew = model.predict_proba(Xnew)[:, 1]
        ynew = model.predict_proba(Xnew)
        ynew = np.array(ynew)
        # show the inputs and predicted outputs
        print('--------------')
        for i in range(len(Xnew)):
            print("X=%s \n Predicted=%s" % (Xnew[i], ynew[i]))
The output of this is
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=[0.9931685]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=[0.35249507]
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=[0.35249507]
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=[1.]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=[0.17942095]
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=[0.17942095]
Whereas if I use Logistic Regression (uncomment the commented lines), the predictions are consistent:
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=0.038716408758471876
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=0.038716408758471876
I get that the obvious solution to this is to fit the model before the loop, and that there is probably strong randomness in how Keras models fit the data to the labels, but there are cases where you need a loop to get prediction scores, for example if you want to perform 10-fold cross-validation to get AUC, sensitivity, and specificity values on training data. In these situations this randomness is unacceptable.
What is causing this inconsistency and what is the solution to it?
There are a couple of issues with the way you are trying to make reproducible results with Keras.
You are calling fit (when i==1) on the already fitted model (from i==0), so the optimizer sees different sets of initial weights in the two cases and you end up with two different models. Solution: get a fresh model every time. This is not the case with sklearn, which starts with freshly initialized weights every time fit is called.
model.fit internally may use the current state of the random number generator. You seeded it outside the loop, so the state will be different when fit is called the second time. Solution: seed inside the loop.
Sample code with issue
# Issue 2 here
tf.random.set_seed(1337)

def get_model():
    model = Sequential()
    model.add(Dense(4, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model

X = np.random.randn(10,8)
y = np.random.randn(10,1)

# Issue 1 here
model = get_model()

results = []
for i in range(10):
    model.fit(X, y, epochs=5, verbose=0, shuffle=False)
    results.append(np.sum(model.predict(X)))
assert np.all(np.isclose(results, results[0]))
As you can see, the assert fails.
Corrected code
results = []
for i in range(10):
    tf.random.set_seed(1337)
    model = get_model()
    model.fit(X, y, epochs=5, verbose=0, shuffle=False)
    results.append(np.sum(model.predict(X)))
assert np.all(np.isclose(results, results[0]))
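For the 10-fold cross-validation use case mentioned in the question, the same recipe applies: re-seed and rebuild the model inside every fold. A minimal sketch (assuming scikit-learn's KFold and the get_model above; the variable names are only illustrative):
from sklearn.model_selection import KFold

kf = KFold(n_splits=10, shuffle=True, random_state=1337)
fold_scores = []
for train_idx, test_idx in kf.split(X):
    tf.random.set_seed(1337)   # seed inside the loop (issue 2)
    fold_model = get_model()   # fresh model for every fold (issue 1)
    fold_model.fit(X[train_idx], y[train_idx], epochs=5, verbose=0, shuffle=False)
    fold_scores.append(fold_model.evaluate(X[test_idx], y[test_idx], verbose=0))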
I'm using Tensorflow to train a network to predict the third item in a list of numbers.
When I train, the network appears to train quite well and do well on both the training and test set. However, when I evaluate its performance myself, it seems to be doing quite poorly.
For example, at the end of training, Tensorflow says that the validation loss is 2.1 x 10^(-5). However, when I compute it myself, I get 0.17 x 10^0. What am I doing wrong?
Here's code that can be run on Google Colab:
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
def create_dataset(k=5, n=2, example_amount=200):
    '''Create a dataset of numbers where the goal is to always output the nth number'''
    # UPGRADE: this could be done better with numpy to just generate all the examples at once
    example_amount = 1000
    x = []
    y = []
    ans = [x, y]
    for i in range(example_amount):
        example_x = np.random.rand(k)
        example_y = example_x[n]
        x.append(example_x)
        y.append(example_y)
    return ans

def tensorize(tensor_like) -> tf.Tensor:
    '''Turn stuff into tensors'''
    return tf.convert_to_tensor(tensor_like, dtype=tf.float32)

def split_dataset(dataset, train_split=0.8, random_state=42):
    '''
    Takes in a list (or tuple) where index 0 contains the inputs and index 1 contains the outputs.
    Outputs x_train, x_test, y_train, y_test, train_indexes, test_indexes, all as tf.Tensor.
    '''
    indices = np.arange(len(dataset[0]))
    return tuple([tensorize(data) for data in train_test_split(dataset[0], dataset[1], indices, train_size=train_split, random_state=random_state)])
# how many numbers in each example
K = 5
# the index of the solution
N = 2
# how many examples
EXAMPLE_AMOUNT = 20000
# what percentage of the examples are in the training set
TRAIN_SPLIT = 0.5
# how long to train for
epochs = 50
dataset = create_dataset(K, N, EXAMPLE_AMOUNT)
x_train, x_test, y_train, y_test, train_indexes, test_indexes = split_dataset(dataset, train_split=TRAIN_SPLIT)
model_input = tf.keras.layers.Input(shape=(K,), name="input")
model_dense1 = tf.keras.layers.Dense(10, name="dense1")(model_input)
model_dense2 = tf.keras.layers.Dense(10, name="dense2")(model_dense1)
model_output = tf.keras.layers.Dense(1, name="output")(model_dense2)
model = tf.keras.Model(inputs=model_input, outputs=model_output)
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")
history = model.fit(x=x_train, y=y_train, validation_data=(x_test, y_test), epochs=epochs)
# the validation loss as Tensorflow computes it
print(history.history["val_loss"][-1]) # 2.1036579710198566e-05
# the validation loss as I compute it
val_loss = tf.math.reduce_mean(tf.keras.losses.MSE(y_test, model.predict(x_test))).numpy()
print(val_loss) # 0.1655631
What you're missing is the shape of y_test:
y_test.numpy().shape
(500,) <-- causing the behaviour
Simply reshape it like:
val_loss = tf.math.reduce_mean(tf.keras.losses.MSE(y_test.numpy().reshape(-1,1), model.predict(x_test))).numpy()
print(val_loss) # 1.1548506e-05
Also:
history.history["val_loss"][-1] # 1.1548506336112041e-05
Or you can flatten() both arrays when calculating it:
val_loss = tf.math.reduce_mean(tf.keras.losses.MSE(y_test.numpy().flatten(), model.predict(x_test).flatten())).numpy()
print(val_loss) # 1.1548506e-05
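The underlying cause is broadcasting: with y_test of shape (500,) and model.predict(x_test) of shape (500, 1), the element-wise difference inside MSE becomes a (500, 500) matrix, so every prediction is compared against every label instead of its matching one. A tiny illustration of the shape behaviour (the shapes mirror the question, the data is random):
import numpy as np

y_true = np.random.rand(500)       # shape (500,), like y_test
y_pred = np.random.rand(500, 1)    # shape (500, 1), like model.predict(x_test)

print((y_true - y_pred).shape)          # (500, 500): every prediction vs every label
print((y_true - y_pred.ravel()).shape)  # (500,): matched pairs, which is what we want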
I have been working with binary sequential inputs and outputs using Tensorflow 2.0, and I've been wondering which approach Tensorflow uses to compute metrics such as recall or accuracy during training in those scenarios.
Each sample to my network consists of 60 timesteps, each with 300 features, so my expected output is a (60, 1) array of 1s and 0s. Suppose I have 2000 validation samples. When evaluating the validation set for each epoch, does TensorFlow concatenate all of the 2000 samples into a single (2000*60=120000, 1) array and then compare it to the concatenated ground-truth labels, or does it evaluate each of the (60, 1) outputs individually and then return the mean of those values? Is there any way to modify this behavior?
TensorFlow/Keras by default computes the metrics batch-wise for the training data, while it computes the same metrics on ALL of the data passed in the validation_data parameter of the fit method.
This means that the metric printed during fitting for the training data is the mean of that score calculated over all the batches. In other words, for the training set Keras evaluates each batch individually and then returns the mean of those values. For the validation data it's different: Keras takes all the validation samples and then compares them with the "concatenated" ground-truth labels.
To demonstrate this behavior with code, I propose a dummy example. I provide a custom callback that computes the accuracy score on ALL the data passed at the end of each epoch (for train and, optionally, validation). This is useful for understanding the behavior of TensorFlow during training.
import numpy as np
from sklearn.metrics import accuracy_score
import tensorflow as tf
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
from tensorflow.keras.callbacks import *
class ACC_custom(tf.keras.callbacks.Callback):
    def __init__(self, train, validation=None):
        super(ACC_custom, self).__init__()
        self.validation = validation
        self.train = train
    def on_epoch_end(self, epoch, logs={}):
        logs['ACC_score_train'] = float('-inf')
        X_train, y_train = self.train[0], self.train[1]
        y_pred = (self.model.predict(X_train).ravel() > 0.5) + 0
        score = accuracy_score(y_train.ravel(), y_pred)
        if (self.validation):
            logs['ACC_score_val'] = float('-inf')
            X_valid, y_valid = self.validation[0], self.validation[1]
            y_val_pred = (self.model.predict(X_valid).ravel() > 0.5) + 0
            val_score = accuracy_score(y_valid.ravel(), y_val_pred)
            logs['ACC_score_train'] = np.round(score, 5)
            logs['ACC_score_val'] = np.round(val_score, 5)
        else:
            logs['ACC_score_train'] = np.round(score, 5)
create dummy data
x_train = np.random.uniform(0,1, (1000,60,10))
y_train = np.random.randint(0,2, (1000,60,1))
x_val = np.random.uniform(0,1, (500,60,10))
y_val = np.random.randint(0,2, (500,60,1))
fit model
inp = Input(shape=((60,10)), dtype='float32')
x = Dense(32, activation='relu')(inp)
out = Dense(1, activation='sigmoid')(x)
model = Model(inp, out)
es = EarlyStopping(patience=10, verbose=1, min_delta=0.001,
monitor='ACC_score_val', mode='max', restore_best_weights=True)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(x_train,y_train, epochs=10, verbose=2,
callbacks=[ACC_custom(train=(x_train,y_train),validation=(x_val,y_val)),es],
validation_data=(x_val,y_val))
In the graphs below I compare the accuracies computed by our callback with the accuracy computed by Keras:
import matplotlib.pyplot as plt
plt.plot(history.history['ACC_score_train'], label='accuracy_callback_train')
plt.plot(history.history['accuracy'], label='accuracy_default_train')
plt.legend(); plt.title('train accuracy')
plt.plot(history.history['ACC_score_val'], label='accuracy_callback_valid')
plt.plot(history.history['val_accuracy'], label='accuracy_default_valid')
plt.legend(); plt.title('validation accuracy')
As we can see, the accuracy on the train data (first plot) differs between the default method and our callback. This means that the accuracy on the training data is calculated batch-wise.
The validation accuracy (second plot) calculated by our callback and by the default method is the same! This means that the score on the validation data is computed in one shot.
I'm hoping to get clarification on how the shuffle argument in tf.data.Dataset.list_files() works. The documentation states that when shuffle=True, the filenames will be shuffled randomly. I've made model predictions using a tfrecords dataset that has been loaded using tf.data.Dataset.list_files(), and I would've expected the accuracy metric to be the same no matter the order of the files (i.e. whether shuffle is True or False), but am seeing otherwise.
Is this expected behavior, or is there something wrong with my code or interpretation? I have reproducible example code below.
Oddly, as long as tf.random.set_random_seed() is set initially (and it seems it doesn't even matter what seed value is set), the prediction results are the same whether shuffle is True or False in list_files().
tensorflow==1.13.1, keras==2.2.4
Thanks for any clarifications!
Edit: re-thinking it through, I'm wondering whether Y = [y[0] for _ in range(steps) for y in sess.run(Y)] is a separate and independent call?
# Fit and Save a Dummy Model
import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from sklearn import datasets, metrics
seed = 7
np.random.seed(seed)
tf.random.set_random_seed(seed)
dataset = datasets.load_iris()
X = dataset.data
Y = dataset.target
dummy_Y = np_utils.to_categorical(Y)
# 150 rows
print(len(X))
model = Sequential()
model.add(Dense(8, input_dim=4, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, dummy_Y, epochs=10, batch_size=10, verbose=2)
model.save('./iris/iris_model')
predictions = model.predict(X)
predictions = np.argmax(predictions, axis=1)
# returns accuracy = 0.3466666666666667
print(metrics.accuracy_score(y_true=Y, y_pred=predictions))
Split dataset into multiple tfrecords files so we can reload it with list_files() later:
numrows = 15
for i, j in enumerate(range(0, len(X), numrows)):
    with tf.python_io.TFRecordWriter('./iris/iris{}.tfrecord'.format(i)) as writer:
        for x, y in zip(X[j:j+numrows, ], Y[j:j+numrows, ]):
            features = tf.train.Features(feature={
                'X': tf.train.Feature(float_list=tf.train.FloatList(value=x)),
                'Y': tf.train.Feature(int64_list=tf.train.Int64List(value=[y]))
            })
            example = tf.train.Example(features=features)
            writer.write(example.SerializeToString())
At this point, I exit (ipython) and restart again:
import numpy as np
import tensorflow as tf
from keras.models import load_model
from sklearn import metrics
model = load_model('./iris/iris_model')
batch_size = 10
steps = int(150/batch_size)
file_pattern = './iris/iris*.tfrecord'
feature_description = {
    'X': tf.FixedLenFeature([4], tf.float32),
    'Y': tf.FixedLenFeature([1], tf.int64)
}

def _parse_function(example_proto):
    return tf.parse_single_example(example_proto, feature_description)

def load_data(filenames, batch_size):
    raw_dataset = tf.data.TFRecordDataset(filenames)
    dataset = raw_dataset.map(_parse_function)
    dataset = dataset.batch(batch_size, drop_remainder=True)
    dataset = dataset.prefetch(2)
    iterator = dataset.make_one_shot_iterator()
    record = iterator.get_next()
    return record['X'], record['Y']

def get_predictions_accuracy(filenames):
    X, Y = load_data(filenames=filenames, batch_size=batch_size)
    predictions = model.predict([X], steps=steps)
    predictions = np.argmax(predictions, axis=1)
    print(len(predictions))
    with tf.Session() as sess:
        Y = [y[0] for _ in range(steps) for y in sess.run(Y)]
    print(metrics.accuracy_score(y_true=Y, y_pred=predictions))
# No shuffle results:
# Returns expected accuracy = 0.3466666666666667
filenames_noshuffle = tf.data.Dataset.list_files(file_pattern=file_pattern, shuffle=False)
get_predictions_accuracy(filenames_noshuffle)
# Shuffle results, no seed value set:
# Returns UNEXPECTED accuracy (non-deterministic value)
filenames_shuffle_noseed = tf.data.Dataset.list_files(file_pattern=file_pattern, shuffle=True)
get_predictions_accuracy(filenames_shuffle_noseed)
# Shuffle results, seed value set:
# Returns expected accuracy = 0.3466666666666667
# It seems like it doesn't even matter what seed value you set, as long as you set one
seed = 1000
tf.random.set_random_seed(seed)
filenames_shuffle_seed = tf.data.Dataset.list_files(file_pattern=file_pattern, shuffle=True)
get_predictions_accuracy(filenames_shuffle_seed)
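If the edit's suspicion is right and the sess.run(Y) comprehension really is a separate, independent pass over the (possibly re-shuffled) file list, one way to rule that out is to pull X and Y together in a single pass and then predict on the resulting arrays, so the label order can never drift from the order the predictions were made in. A minimal sketch under the same TF 1.x setup as above (the helper name is only illustrative):
def get_predictions_accuracy_aligned(filenames):
    X, Y = load_data(filenames=filenames, batch_size=batch_size)
    x_all, y_all = [], []
    with tf.Session() as sess:
        for _ in range(steps):
            x_batch, y_batch = sess.run([X, Y])  # features and labels from the same batch
            x_all.append(x_batch)
            y_all.append(y_batch[:, 0])
    x_all = np.concatenate(x_all)
    y_all = np.concatenate(y_all)
    predictions = np.argmax(model.predict(x_all), axis=1)
    print(metrics.accuracy_score(y_true=y_all, y_pred=predictions))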