I'm trying to calculate MSE using TensorFlow and compare the result with the sklearn.metrics.mean_squared_error method.
def mse(y, y_hat):
    return tf.reduce_mean(tf.squared_difference(y, y_hat)).eval()
compute_mse = lambda vector1, vector2: mse(vector1,vector2)
My test loop
for n in [1, 5, 10, 10**3]:
    elems = [np.arange(n), np.arange(n,0,-1), np.zeros(n),
             np.ones(n), np.random.random(n), np.random.randint(100, size=n)]
    for el in elems:
        for el_2 in elems:
            true_mse = np.array(mean_squared_error(el, el_2))
            my_mse = compute_mse(el, el_2)
            if not np.allclose(true_mse, my_mse):
                print('Wrong result:')
print("All tests passed")
But my tf function always returns either 0 or 1.
Could you please point out where I'm going wrong?
UPD
Thanks to @apnorton for pointing out the issue with types.
def mse(y, y_hat):
    y_ = tf.Variable(y, tf.float64)
    y_hat_ = tf.Variable(y_hat, tf.float64)
    return tf.reduce_mean(tf.squared_difference(y_, y_hat_)).eval()
If you print all the outputs of your tf function, you'll see it doesn't return only 1s and 0s, but it does only return integers. This is because several of the arrays in elems (those built with np.arange and np.random.randint) have dtype numpy.int32. The sklearn function appears to cast these to floats when doing the mean step, while the TensorFlow approach does not.
To see a fixed variant, consider changing the compute_mse line to:
my_mse = compute_mse(el.astype(float),el_2.astype(float))
Edit: In response to the question in the comments, I'd avoid creating a variable only for the purpose of a cast. Instead, I'd recommend using the tf.to_float method:
def mse(y, y_hat):
    return tf.reduce_mean(tf.squared_difference(tf.to_float(y), tf.to_float(y_hat))).eval()
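Putting it together, here is a minimal end-to-end sketch of the fixed version (assuming TF 1.x, where .eval() needs a default session such as tf.InteractiveSession):
import numpy as np
import tensorflow as tf
from sklearn.metrics import mean_squared_error

sess = tf.InteractiveSession()  # provides the default session that .eval() relies on

def mse(y, y_hat):
    # casting to float prevents reduce_mean from truncating to an integer
    return tf.reduce_mean(tf.squared_difference(tf.to_float(y), tf.to_float(y_hat))).eval()

a = np.arange(5)         # [0, 1, 2, 3, 4]
b = np.arange(5, 0, -1)  # [5, 4, 3, 2, 1]
print(mse(a, b))                   # 9.0
print(mean_squared_error(a, b))    # 9.0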
Related
This is a question I've had before: I have two arrays representing the inputs and corresponding outputs of a function. I need to find the input for a specific output that falls between data points. How do I do that?
For example:
import numpy as np
B = np.arange(0,10,1)
def fun(b):
    return b*3/5
A = fun(B)
How do I get the value of "B" for which fun returns 3.75?
This technique uses linear interpolation to approximate the answer.
I start with this function:
def interpABS(A, B, Aval):
    if Aval > max(A) or Aval < min(A):
        print('Error: Extrapolating beyond given data')
    else:
        if len(A) == len(B):
            for i in np.arange(1, len(A), 1):
                ihi = i
                ilo = i - 1
                if A[i] > Aval:
                    break
            Alo = A[ilo]
            Blo = B[ilo]
            Ahi = A[ihi]
            Bhi = B[ihi]
            out = Blo + (Bhi - Blo) * (Aval - Alo) / (Ahi - Alo)
            return out
        else:
            print('Error: inputs of different sizes')
Note: I'm kind of an amateur and don't know how to set up exceptions, so instead the error outputs are just print commands on a different path from the rest of the function. Those more experienced than I am may recommend improvements.
Use the output array from your function as A and the corresponding input array as B, then pass your target value as Aval. interpABS will return an approximate input for your original function that produces the target value.
So, for our example above, interpABS(A, B, 3.75) will return a value of 6.25.
This can be useful even if Aval is an exact value of A, to find the corresponding B value, since the math simplifies to Blo + 0. For example, changing Aval to 3.0 in the example above gives 5.0, which is part of the original input set B.
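As an aside (not part of the function above), NumPy's built-in np.interp does the same linear interpolation when A is monotonically increasing, which makes for a handy sanity check:
import numpy as np

B = np.arange(0, 10, 1)
A = B * 3 / 5                  # same data as the example above
print(np.interp(3.75, A, B))   # 6.25, matching interpABS(A, B, 3.75)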
I'm trying to execute these functions:
def evaluate(sentence):
    sentence = preprocess_sentence(sentence)
    sentence = tf.expand_dims(
        START_TOKEN + tokenizer.encode(sentence) + END_TOKEN, axis=0)
    output = tf.expand_dims(START_TOKEN, 0)
    for i in range(MAX_LENGTH):
        predictions = model(inputs=[sentence, output], training=False)
        # select the last word from the seq_len dimension
        predictions = predictions[:, -1:, :]
        predicted_id = tf.cast(tf.argmax(predictions, axis=-1), tf.int32)
        # return the result if the predicted_id is equal to the end token
        if tf.equal(predicted_id, END_TOKEN[0]):
            break
        #check()
        #tf.cond(tf.equal(predicted_id, END_TOKEN[0]),true_fn=break,false_fn=lambda: tf.no_op())
        # concatenated the predicted_id to the output which is given to the decoder
        # as its input.
        output = tf.concat([output, predicted_id], axis=-1)
    return tf.squeeze(output, axis=0)

def predict(sentence):
    prediction = evaluate(sentence)
    predicted_sentence = tokenizer.decode(
        [i for i in prediction if i < tokenizer.vocab_size])
    print('Input: {}'.format(sentence))
    print('Output: {}'.format(predicted_sentence))
    return predicted_sentence
However, I'm getting the following error:
OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
I understand that I have to rewrite the if condition in the form of tf.cond(). However, I don't know how to write break in TensorFlow, and I'm not sure which if condition is causing the problem, since exactly the same function works properly in this notebook:
https://colab.research.google.com/github/tensorflow/examples/blob/master/community/en/transformer_chatbot.ipynb#scrollTo=_NURhwYz5AXa
Any help?
The code in the notebook works because it uses TF 2.0, which has eager execution enabled by default. You can turn it on in older versions with tf.enable_eager_execution.
Alternatively, you can use break in graph mode without writing tf.cond if you use tf.function or tf.autograph, but they have some restrictions on the code you can run.
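For instance, a rough sketch (assuming TF 2.x) of AutoGraph letting an ordinary Python break work on a tensor condition inside a decorated function; the function and values here are made up for illustration:
import tensorflow as tf

@tf.function
def first_match(values, target):
    # AutoGraph rewrites this loop, including the tensor-dependent if/break,
    # into graph control flow (tf.while_loop / tf.cond).
    index = tf.constant(-1)
    for i in tf.range(tf.shape(values)[0]):
        if tf.equal(values[i], target):
            index = i
            break
    return index

print(first_match(tf.constant([3, 7, 9, 7]), tf.constant(9)).numpy())  # 2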
There is nothing wrong with the break statement. The problem is elsewhere.
if tf.equal(predicted_id, END_TOKEN[0]):
    break
will give an error about using a Python bool in tensor ops. Since you have already used a tf.equal condition, this can be confusing. The solution is simple: the error is being thrown for the
if (boolean): Python syntax.
You have to take care of this Python bool syntax and convert it to tensor style, based on what you are planning to achieve. Remember, the condition returns a tensor of boolean values; read this tensor and then proceed to do what you want. So, for example, the following would execute the break unconditionally, irrespective of the value of the condition:
if tf.equal(predicted_id, END_TOKEN[0]) is not None:
    break
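In eager mode (TF 2.x), one way to actually read the condition rather than bypass it is to reduce the boolean tensor to a Python bool explicitly; this is a sketch, not the notebook's exact code:
if bool(tf.reduce_any(tf.equal(predicted_id, END_TOKEN[0]))):
    break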
I am creating neural nets with TensorFlow and skflow; for some reason I want to get the values of some inner tensors for a given input, so I am using myClassifier.get_layer_value(input, "tensorName"), myClassifier being a skflow.estimators.TensorFlowEstimator.
However, I find it difficult to figure out the correct syntax of the tensor name, even knowing its name (and I'm getting confused between operations and tensors), so I'm using TensorBoard to plot the graph and look for the name.
Is there a way to enumerate all the tensors in a graph without using tensorboard?
You can do
[n.name for n in tf.get_default_graph().as_graph_def().node]
Also, if you are prototyping in an IPython notebook, you can show the graph directly in the notebook; see the show_graph function in Alexander's Deep Dream notebook.
I'll try to summarize the answers:
To get all nodes in the graph: (type tensorflow.core.framework.node_def_pb2.NodeDef)
all_nodes = [n for n in tf.get_default_graph().as_graph_def().node]
To get all ops in the graph: (type tensorflow.python.framework.ops.Operation)
all_ops = tf.get_default_graph().get_operations()
To get all variables in the graph: (type tensorflow.python.ops.resource_variable_ops.ResourceVariable)
all_vars = tf.global_variables()
To get all tensors in the graph: (type tensorflow.python.framework.ops.Tensor)
all_tensors = [tensor for op in tf.get_default_graph().get_operations() for tensor in op.values()]
To get all placeholders in the graph: (type tensorflow.python.framework.ops.Tensor)
all_placeholders = [placeholder for op in tf.get_default_graph().get_operations() if op.type=='Placeholder' for placeholder in op.values()]
TensorFlow 2
To get the graph in TensorFlow 2, instead of tf.get_default_graph() you need to instantiate a tf.function first and access its graph attribute, for example:
graph = func.get_concrete_function().graph
where func is a tf.function
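A small self-contained illustration of that pattern (the function here is made up; TF 2.x):
import tensorflow as tf

@tf.function
def double(x):
    return x * 2

# Trace the function for a concrete input signature, then inspect its graph.
graph = double.get_concrete_function(tf.TensorSpec(shape=None, dtype=tf.float32)).graph
print([op.name for op in graph.get_operations()])
print([tensor.name for op in graph.get_operations() for tensor in op.outputs])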
There is a way to do it slightly faster than in Yaroslav's answer by using get_operations. Here is a quick example:
import tensorflow as tf
a = tf.constant(1.3, name='const_a')
b = tf.Variable(3.1, name='variable_b')
c = tf.add(a, b, name='addition')
d = tf.multiply(c, a, name='multiply')
for op in tf.get_default_graph().get_operations():
    print(str(op.name))
tf.all_variables() can get you the information you want.
Also, this commit made today in TensorFlow Learn provides a function get_variable_names in the estimator that you can use to retrieve all variable names easily.
I think this will do too:
print(tf.contrib.graph_editor.get_tensors(tf.get_default_graph()))
But compared with Salvado and Yaroslav's answers, I don't know which one is better.
The accepted answer only gives you a list of strings with the names. I prefer a different approach, which gives you (almost) direct access to the tensors:
graph = tf.get_default_graph()
list_of_tuples = [op.values() for op in graph.get_operations()]
list_of_tuples now contains every tensor, each within a tuple. You could also adapt it to get the tensors directly:
graph = tf.get_default_graph()
list_of_tuples = [op.values()[0] for op in graph.get_operations()]
Since the OP asked for the list of the tensors instead of the list of operations/nodes, the code should be slightly different:
graph = tf.get_default_graph()
tensors_per_node = [node.values() for node in graph.get_operations()]
tensor_names = [tensor.name for tensors in tensors_per_node for tensor in tensors]
Previous answers are good, I'd just like to share a utility function I wrote to select Tensors from a graph:
def get_graph_op(graph, and_conds=None, op='and', or_conds=None):
    """Selects nodes' names in the graph if:
    - The name contains all items in and_conds
    - OR/AND depending on op
    - The name contains any item in or_conds

    Conditions starting with a "!" are negated.
    Returns all ops if no optional arguments are given.

    Args:
        graph (tf.Graph): The graph containing sought tensors
        and_conds (list(str), optional): Defaults to None.
            "and" conditions
        op (str, optional): Defaults to 'and'.
            How to link the and_conds and or_conds:
            with an 'and' or an 'or'
        or_conds (list(str), optional): Defaults to None.
            "or" conditions

    Returns:
        list(str): list of relevant tensor names
    """
    assert op in {'and', 'or'}

    if and_conds is None:
        and_conds = ['']
    if or_conds is None:
        or_conds = ['']

    node_names = [n.name for n in graph.as_graph_def().node]

    ands = {
        n for n in node_names
        if all(
            cond in n if '!' not in cond
            else cond[1:] not in n
            for cond in and_conds
        )}

    ors = {
        n for n in node_names
        if any(
            cond in n if '!' not in cond
            else cond[1:] not in n
            for cond in or_conds
        )}

    if op == 'and':
        return [
            n for n in node_names
            if n in ands.intersection(ors)
        ]
    elif op == 'or':
        return [
            n for n in node_names
            if n in ands.union(ors)
        ]
So if you have a graph with ops:
['model/classifier/dense/kernel',
'model/classifier/dense/kernel/Assign',
'model/classifier/dense/kernel/read',
'model/classifier/dense/bias',
'model/classifier/dense/bias/Assign',
'model/classifier/dense/bias/read',
'model/classifier/dense/MatMul',
'model/classifier/dense/BiasAdd',
'model/classifier/ArgMax/dimension',
'model/classifier/ArgMax']
Then running
get_graph_op(tf.get_default_graph(), ['dense', '!kernel'], 'or', ['Assign'])
returns:
['model/classifier/dense/kernel/Assign',
'model/classifier/dense/bias',
'model/classifier/dense/bias/Assign',
'model/classifier/dense/bias/read',
'model/classifier/dense/MatMul',
'model/classifier/dense/BiasAdd']
The following solution works for me in TensorFlow 2.3 -
def load_pb(path_to_pb):
    with tf.io.gfile.GFile(path_to_pb, 'rb') as f:
        graph_def = tf.compat.v1.GraphDef()
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
        return graph

tf_graph = load_pb(MODEL_FILE)
sess = tf.compat.v1.Session(graph=tf_graph)

# Show tensor names in graph
for op in tf_graph.get_operations():
    print(op.values())
where MODEL_FILE is the path to your frozen graph.
Taken from here.
This worked for me:
for n in tf.get_default_graph().as_graph_def().node:
    print('\n', n)
This code to calculate the trace of a matrix (based on an example in the Theano "loop" tutorial) works fine:
import numpy as np
import theano as th
import theano.tensor as T

floatX = 'float32'

X = T.matrix()
results = th.scan(lambda i, j, t_f: T.cast(X[i, j] + t_f, floatX),
                  sequences=[T.arange(X.shape[0]), T.arange(X.shape[1])],
                  outputs_info=np.asarray(0., dtype=floatX))[0]
result = results[-1]

compute_trace = th.function([X], result)

x = np.eye(5, dtype=floatX)
x[0] = np.arange(5, dtype=floatX)
print(compute_trace(x))
But if I remove the cast operation from the lambda function like this:
lambda i,j,t_f : X[i,j] + t_f
The following error message is produced:
ValueError: When compiling the inner function of scan the following error has been encountered: The initial state (outputs_info in scan nomenclature) of variable IncSubtensor{Set;:int64:}.0 (argument number 2) has dtype float32, while the result of the inner function (fn) has dtype float64. This can happen if the inner function of scan results in an upcast or downcast.
Why so? X and outputs_info are explicitly set to float32. How does the result of adding them get to be float64?
This is a very late answer, but we're working on a fork of Theano called Aesara, and, since people still run into problems like this, it seems worthwhile to provide a public explanation.
That said, the issue is X = T.matrix(). T.matrix creates a float64 matrix when theano.config.floatX == "float64" (the default), and the result is an upcast to float64 for the sum in the body of the scan's loop function.
If X = T.fmatrix() is used, a float32 matrix is created instead and the problem is no longer present; otherwise, as mentioned in the comments, one can also set theano.config.floatX to "float32".
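For concreteness, here is a sketch of the original example rewritten with T.fmatrix, so the explicit cast is no longer needed:
import numpy as np
import theano as th
import theano.tensor as T

floatX = 'float32'

# T.fmatrix makes X float32 regardless of theano.config.floatX,
# so the sum in the scan body no longer upcasts to float64.
X = T.fmatrix()
results = th.scan(lambda i, j, t_f: X[i, j] + t_f,
                  sequences=[T.arange(X.shape[0]), T.arange(X.shape[1])],
                  outputs_info=np.asarray(0., dtype=floatX))[0]
compute_trace = th.function([X], results[-1])

x = np.eye(5, dtype=floatX)
x[0] = np.arange(5, dtype=floatX)
print(compute_trace(x))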
I'm working on an ANN and got it working on simple AND/XOR problems using gradient descent, but I am finding it much more difficult to find the optimal weights for more complex problems. My solution was to use SciPy's minimize function instead, but I am having some trouble.
My ANN is a class, and when an ANN object is created, the weights are created upon initialization as the data member self.__weights. This is useful because I don't actually have to pass a weights parameter into any of the class's methods, and it worked in my earlier implementation that didn't use SciPy.
I use self.__weights as the initial guess for the minimize function and pass in the other arguments (the input data and the correct output values) needed to make the function work. But it seems the optimizer keeps passing the weights into the methods (cost function / feedforward / backprop) as a parameter, even though there is no need to, since they can be read from self.__weights. For instance, when my FeedForward method is called, it receives the weights and complains that the input isn't the correct size; this is because the weights were passed in instead. The input data ends up being passed through minimize's optional 'args=' parameter.
So, with all that explained: must the value being optimized actually be a parameter of the methods used to optimize it, rather than just a class data member that can be accessed from anywhere within the class? In the examples I've found that use a class, they pass in their 'weights' or whatever value is being optimized, rather than using a data member of the class.
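(For reference, here is a stripped-down illustration, with made-up names, of the calling convention I seem to be running into: minimize always hands the current guess to the objective as its first positional argument, with the args= tuple appended after it.)
from scipy.optimize import minimize
import numpy as np

def objective(x, data, target):
    # x is the optimizer's current guess; data and target come from args=
    return np.sum((data.dot(x) - target) ** 2)

data = np.array([[1.0, 0.0], [0.0, 2.0]])
target = np.array([1.0, 4.0])
result = minimize(objective, x0=np.zeros(2), args=(data, target))
print(result.x)  # approximately [1.0, 2.0]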
EDIT:
This code snippet is located in my Train method. Its parameters are input_data and target.
opt_theta = minimize(self.CostFunction,x0=self.__weights,args = (input_data,target),method='Newton-CG',jac= self.Weight_Grad)
print(opt_theta)
self.__weights = opt_theta.x
self.CostFunction and self.Weight_Grad both take those same two parameters. Both of these functions use the Feedforward method to get the output of the network and then do their respective tasks with that info. Whenever I call the minimize function, it seems to pass self.__weights into self.CostFunction as the input_data parameter, which then gets passed into the Feedforward method as its input, and I get an error. If I print the "input" passed into Feedforward, it is the value of self.__weights. This is how I know self.__weights is getting passed in as a parameter when it shouldn't be.
So, I thought I would create a "dummy" parameter to pass the self.__weights value into for all of the methods (self.CostFunction, self.Weight_Grad, Feedforward); an error no longer occurred, but there was no change in the weights or the output. What do I need to do to get self.__weights to update?
Here are the methods in case it helps:
def Feedforward(self, input):
    # Code to take self.__weights and convert to a list of matrices.
    weight_matrix_list = []
    prev_val = 0
    for i in range(len(self.__weight_sizes)):
        curr_w_size = self.__weight_sizes[i]
        weight_count = curr_w_size[0] * curr_w_size[1]
        matrix_elements = self.__weights[prev_val:prev_val + weight_count]
        weight_matrix_list.append(matrix_elements.reshape(curr_w_size))
        prev_val += weight_count  # advance the offset into the flat weight vector

    self.__input_cases = np.shape(input)[0]

    # Empty list to hold the output of every layer.
    output_list = []
    # Appends the output of the 1st input layer.
    output_list.append(input)

    for i in range(self.__layers - 1):
        if i == 0:
            print(self.__input_cases)
            print(input)
            X = np.concatenate((np.ones((self.__input_cases, 1)), input), 1)
            output = self.sigmoid(np.dot(X, weight_matrix_list[0].T))
            output_list.append(output)
        else:
            output = self.sigmoid(np.dot(np.concatenate((np.ones((self.__input_cases, 1)), output), 1), weight_matrix_list[i].T))
            output_list.append(output)

    return output_list
def CostFunction(self, input_data, target, error_func=1):
    """Gives the cost of using a particular weight matrix
    based off of the input and targeted output"""
    print("Cost")
    # Run the network to get output using current theta matrices.
    output = self.Feedforward(input_data)[-1]
    # Determines number of input/training examples
    m = np.shape(input_data)[0]

    ##### Allows user to choose Cost Functions. #####
    #
    # Log Based Error Function
    #
    if error_func == 0:
        error = np.multiply(-target, np.log(output)) - np.multiply((1 - target), np.log(1 - output))
        total_error = np.sum(np.sum(error))
    #
    # Squared Error Cost Function
    #
    elif error_func == 1:
        error = (target - output)**2
        total_error = (1/(2*m)) * np.sum(np.sum(error))
    return total_error
def Weight_Grad(self, input_data, target):
    print('Grad')
    weight_matrix_list = []
    prev_val = 0
    for i in range(len(self.__weight_sizes)):
        curr_w_size = self.__weight_sizes[i]
        weight_count = curr_w_size[0] * curr_w_size[1]
        matrix_elements = self.__weights[prev_val:prev_val + weight_count]
        weight_matrix_list.append(matrix_elements.reshape(curr_w_size))
        prev_val += weight_count  # advance the offset into the flat weight vector

    output_list = self.Feedforward(input_data)

    #
    # Finds the Deltas for Each Layer
    #
    deltas = []
    for i in range(self.__layers - 1):
        # Finds Error Delta for the last layer
        if i == 0:
            error = (target - output_list[-1])
            error_delta = -1 * np.multiply(error, np.multiply(output_list[-1], (1 - output_list[-1])))
            deltas.append(error_delta)
        # Finds Error Delta for the hidden layers
        else:
            # Weight matrices have bias values removed
            error_delta = np.multiply(np.dot(deltas[-1], weight_matrix_list[-i][:, 1:]), output_list[-i-1] * (1 - output_list[-i-1]))
            deltas.append(error_delta)

    #
    # Finds the Deltas for each Weight Matrix
    #
    Weight_Delta_List = []
    deltas.reverse()
    for i in range(len(weight_matrix_list)):
        current_weight_delta = (1/self.__input_cases) * np.dot(deltas[i].T, np.concatenate((np.ones((self.__input_cases, 1)), output_list[i]), 1))
        Weight_Delta_List.append(current_weight_delta)

    #
    # Converts Weight Delta List to a single vector
    #
    Weight_Delta_Vector = np.array([])
    for i in range(len(Weight_Delta_List)):
        Weight_Delta_Vector = np.concatenate((Weight_Delta_Vector, Weight_Delta_List[i].flatten()))
    print("WDV Shape:", np.shape(Weight_Delta_Vector))
    return Weight_Delta_Vector
def Train(self, input_data, target):
    opt_theta = minimize(self.CostFunction, x0=self.__weights, args=(input_data, target), method='Newton-CG', jac=self.Weight_Grad)
    print(opt_theta)
    self.__weights = opt_theta.x
    print("Done")