Iterate over tensor in a custom loss function - python

I need to use this loss function for a CNN. list_distances and list_residual are output tensors from hidden layers that are needed to compute the loss, but when I execute the code I get this error:
TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.
Is there another way to iterate over tensors without the construct x in X, for example by converting them to a NumPy array or by using the Keras backend functions?
def DBL(y_true, y_pred, list_distances, list_residual, l=0.65):
    prob_dist = []
    Li = []
    # mean of the images power spectrum
    S = np.sum([np.power(np.abs(fp.fft2(residual)), 2)
                for residual in list_residual], axis=0) / K.shape(list_residual)[0]
    # log-ratio between the geometric and arithmetic of S
    R = np.log10((scistats.gmean(S) / np.mean(S)))
    for c_i, dis_i in enumerate(list_distances):
        prob_dist.append([
            np.exp(-dis_i) / sum([np.exp(-dis_j) if c_j != c_i else 0 for c_j, dis_j in enumerate(list_distances)])
        ])
    for count, _ in enumerate(prob_dist):
        Li.append(
            -1 * np.log10(sum([p_j for c_j, p_j in enumerate(prob_dist[count])
                               if y_pred[count] == 1 and count != c_j])))
    L0 = np.sum(Li)
    return L0 - l * R

You need to define a custom function to feed into tf.map_fn() (see the TensorFlow docs).
Mapper functions map (funnily enough) an existing object (here, a tensor) into a new one using a function you define.
They apply the custom function to every element of the object, without all the mucking about with for loops.
For instance (untested code, may not run; I'm on my phone at the moment):
def custom(a):
    b = a + 1
    return b
original = np.array([2,2,2])
mapped = tf.map_fn(custom, original)
# mapped == [3, 3, 3] ... hopefully
The TensorFlow examples all use lambda functions, so you might need to define your functions like that if the above doesn't work. TensorFlow example:
elems = np.array([1, 2, 3, 4, 5, 6])
squares = tf.map_fn(lambda x: x * x, elems)
# squares == [1, 4, 9, 16, 25, 36]
Edit:
As an aside, map functions are much easier to parallelise than for loops, because each element of the object is assumed to be processed independently of the others, so you may see a performance uplift by using them.
Edit 2:
For the "reduce sum, but not on this index" part, I would heavily recommend you start looking back at matrix operations. As mentioned, map functions work element-wise: they are not aware of other elements. A reduce function is what you want, but even those are finicky when you try to do "not this index" sums. Also, TensorFlow is built around matrix ops, not the MapReduce paradigm.
Something along these lines might help:
sess = tf.Session()
var = np.ones([3, 3, 3]) * 5
zero_identity = tf.linalg.set_diag(
    var, tf.zeros(var.shape[0:-1], dtype=tf.float64)
)
exp_one = tf.exp(var)
exp_two = tf.exp(zero_identity)
summed = tf.reduce_sum(exp_two, axis = [0,1])
final = exp_one / summed
print("input matrix: \n", var, "\n")
print("Identities of the matrix to Zero: \n", zero_identity.eval(session=sess), "\n")
print("Exponential Values numerator: \n", exp_one.eval(session=sess), "\n")
print("Exponential Values to Sum: \n", exp_two.eval(session=sess), "\n")
print("Summed values for zero identity matrix\n ... along axis [0,1]: \n", summed.eval(session=sess), "\n")
print("Output:\n", final.eval(session=sess), "\n")
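For the "sum over every index except this one" step in the question, one loop-free option is to compute the full sum once and subtract each element's own term. The sketch below uses NumPy so that it runs on its own; the helper name leave_one_out_probs is made up for illustration, and the same arithmetic should carry over to tf.exp and tf.reduce_sum on a 1-D tensor.

```python
import numpy as np

def leave_one_out_probs(distances):
    """p_i = exp(-d_i) / sum_{j != i} exp(-d_j), without a Python loop."""
    e = np.exp(-np.asarray(distances, dtype=np.float64))
    # Total sum minus each element's own term gives the "all but i" sum.
    denom = e.sum() - e
    return e / denom

probs = leave_one_out_probs([0.5, 1.0, 2.0])
```

Subtracting from the total is cheap, but it can lose precision when one term dominates the sum; in that case a masked sum over a pairwise matrix may be the safer choice.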

Related

Storing parameter values in every step of the custom gradient descent algorithm in Python

I'm trying to make a custom gradient descent estimator, but I am having trouble storing the parameter values at every step of the gradient descent algorithm. Here is the code skeleton:
from numpy import *
import pandas as pd
from joblib import Parallel, delayed
from multiprocessing import cpu_count
ftemp = zeros((2, ))
stemp = empty([1, ], dtype='<U10')
la = 10
vals = pd.DataFrame(index=range(la), columns=['a', 'b', 'string'])
def sfun(k1, k2, k3, string):
    a = k1*k2
    b = k2*k3
    s = string
    nums = [a, b]
    strs = [s]
    return(nums, strs)
def store(inp):
    r = sfun(inp[0], inp[1], inp[2], inp[3])
    ftemp = append(ftemp, asarray(r[0]), axis = 0)
    stemp = append(stemp, asarray(r[1]), axis = 0)
    return(ftemp, stemp)
for l in range(la):
    inputs = [(2, 3, 4, 'he'),
              (4, 6, 2, 'je'),
              (2, 7, 5, 'ke')]
    Parallel(n_jobs = cpu_count())(delayed(store)(i) for i in inputs)
    vals.iloc[l, 0:2] = ftemp[0, 0], ftemp[0, 1]
    vals.iloc[l, 2] = stemp[0]
    d = ftemp[2, 0]-ftemp[0, 0]
Note: most of the gradient descent code is removed because I do not have any issues with that. The main issue I have is storing the values at each step.
sfun() is the loss function (I know that it doesn't look like that here) and store() is just an attempt to store the parameter values at each step.
The important aspect here is that I want to parallelize the process, as sfun() is computationally expensive, and that I want to save the values from all parallel runs.
I tried solving this in many different ways, but I always get a different error.
There is no need for a temporary storage array; the results of the Parallel() call can be stored directly:
a = Parallel(n_jobs = cpu_count())(delayed(store)(i) for i in inputs)
Most importantly, a is populated in the same order as the inputs are given.
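The same collect-the-return-values pattern can be sketched with only the standard library. This swaps joblib for concurrent.futures.ThreadPoolExecutor (an assumption, made so the snippet has no third-party dependencies); Executor.map also yields results in input order. The sfun below is a simplified stand-in for the expensive function in the question, and history is a hypothetical name for the per-step log.

```python
from concurrent.futures import ThreadPoolExecutor

def sfun(k1, k2, k3, string):
    # Stand-in for the expensive loss evaluation in the question.
    return [k1 * k2, k2 * k3], [string]

inputs = [(2, 3, 4, 'he'),
          (4, 6, 2, 'je'),
          (2, 7, 5, 'ke')]

history = []  # one entry per gradient-descent step
for step in range(3):
    with ThreadPoolExecutor() as pool:
        # map() returns results in the same order as `inputs`.
        results = list(pool.map(lambda args: sfun(*args), inputs))
    history.append(results)

nums, strs = history[0][1]  # second input of the first step
```

Threads are used here only to keep the sketch short; for CPU-bound work a process pool (or joblib, as in the answer above) is the more likely fit.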

Is it possible to convert this numpy function to tensorflow?

I have a function that takes a [32, 32, 3] tensor, and outputs a [256,256,3] tensor.
Specifically, the function interprets the smaller array as if it was a .svg file, and 'renders' it to a 256x256 array as a canvas using this algorithm
For an explanation of WHY I would want to do this, see This question
The function behaves exactly as intended, until I try to include it in the training loop of a GAN. The current error I'm seeing is:
NotImplementedError: Cannot convert a symbolic Tensor (mul:0) to a numpy array.
A lot of other answers to similar errors seem to boil down to "You need to re-write the function using tensorflow, not numpy"
Here's the working code using numpy - is it possible to re-write it to exclusively use tensorflow functions?
def convert_to_bitmap(input_tensor, target, j):
    #implied conversion to nparray - the tensorflow docs seem to indicate this is okay, but the error is thrown here when training
    array = input_tensor
    outputArray = target
    output = target
    for i in range(32):
        col = float(array[i,0,j])
        if ((float(array[i,0,0]))+(float(array[i,0,1]))+(float(array[i,0,2]))/3)< 0:
            continue
        #slice only the red channel from the i line, multiply by 255
        red_array = array[i,:,0]*255
        #slice only the green channel, multiply by 255
        green_array = array[i,:,1]*255
        #combine and flatten them
        combined_array = np.dstack((red_array, green_array)).flatten()
        #remove the first two and last two indices of the combined array
        index = [0,1,62,63]
        clipped_array = np.delete(combined_array,index)
        #filter array to remove values less than 0
        filtered = clipped_array > 0
        filtered_array = clipped_array[filtered]
        #check array has an even number of values, delete the last index if it doesn't
        if len(filtered_array) % 2 == 0:
            pass
        else:
            filtered_array = np.delete(filtered_array,-1)
        #convert into a set of tuples
        l = filtered_array.tolist()
        t = list(zip(l, l[1:] + l[:1]))
        if not t:
            continue
        output = fill_polygon(t, outputArray, col)
    return(output)
The 'fill polygon' function is copied from the 'mahotas' library:
def fill_polygon(polygon, canvas, color):
    if not len(polygon):
        return
    min_y = min(int(y) for y,x in polygon)
    max_y = max(int(y) for y,x in polygon)
    polygon = [(float(y),float(x)) for y,x in polygon]
    if max_y < canvas.shape[0]:
        max_y += 1
    for y in range(min_y, max_y):
        nodes = []
        j = -1
        for i,p in enumerate(polygon):
            pj = polygon[j]
            if p[0] < y and pj[0] >= y or pj[0] < y and p[0] >= y:
                dy = pj[0] - p[0]
                if dy:
                    nodes.append( (p[1] + (y-p[0])/(pj[0]-p[0])*(pj[1]-p[1])) )
                elif p[0] == y:
                    nodes.append(p[1])
            j = i
        nodes.sort()
        for n,nn in zip(nodes[::2],nodes[1::2]):
            nn += 1
            canvas[y, int(n):int(nn)] = color
    return(canvas)
NOTE: I'm not trying to get someone to convert the whole thing for me! There are some functions that are pretty obvious (tf.stack instead of np.dstack), but others that I don't even know how to start, like the last few lines of the fill_polygon function above.
Yes, you can actually do this: you can wrap a Python function with tf.py_func (tf.py_function in TensorFlow 2). It is a Python wrapper, but it is extremely slow compared with plain TensorFlow. TensorFlow and CUDA are so fast because they rely on vectorization, meaning you can rewrite many of the loops as mathematical tensor operations, which are very fast.
In general:
If you want to use custom code as a custom layer, I would recommend rethinking the algebra behind those loops and trying to express them differently. If it is just preprocessing before training starts, you can use TensorFlow, but doing the same with NumPy and other libraries is easier.
To your main question: yes, it is possible, but it is better not to use loops. TensorFlow has a built-in loop optimizer, but then you have to use tf.while_loop(), and that is annoying (maybe just for me). I only skimmed your code, but it looks like you should be able to vectorize it quite well using the standard TensorFlow vocabulary. If you want it fast, really fast with GPU support, write everything in TensorFlow rather than a 50/50 mix with tf.convert_to_tensor(), because then it is going to be slow again: you keep switching between the GPU, the CPU, the plain Python interpreter and the low-level TensorFlow API. I hope this helps at least a bit.
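As a rough illustration of what "vectorizing" the polygon fill could look like, the sketch below tests every pixel against every polygon edge at once using the even-odd (ray-crossing) rule. It is written in NumPy so it runs on its own; polygon_mask is a hypothetical helper, not part of mahotas, and it is not a drop-in replacement for fill_polygon because boundary pixels are handled differently. The operations it uses (comparisons, broadcasting, where, sum) all have TensorFlow counterparts.

```python
import numpy as np

def polygon_mask(vertices, height, width):
    """Boolean (height, width) mask of pixels inside a polygon.

    `vertices` is a sequence of (y, x) pairs; uses the even-odd rule.
    """
    v = np.asarray(vertices, dtype=np.float64)
    y0, x0 = v[:, 0], v[:, 1]
    y1, x1 = np.roll(y0, -1), np.roll(x0, -1)  # end point of each edge

    # Pixel coordinates, shaped to broadcast against the edge axis.
    py = np.arange(height, dtype=np.float64)[:, None, None]
    px = np.arange(width, dtype=np.float64)[None, :, None]

    # An edge crosses the pixel's row if its end points lie on opposite sides.
    straddles = (y0 <= py) != (y1 <= py)
    # Horizontal edges never straddle, so their placeholder dy is unused.
    dy = np.where(y1 == y0, 1.0, y1 - y0)
    x_cross = x0 + (py - y0) * (x1 - x0) / dy

    # Count crossings to the right of each pixel; odd means inside.
    crossings = straddles & (px < x_cross)
    return crossings.sum(axis=-1) % 2 == 1

canvas = np.zeros((8, 8, 3))
mask = polygon_mask([(2, 2), (2, 6), (6, 6), (6, 2)], 8, 8)
canvas[mask] = (1.0, 0.0, 0.0)  # paint the polygon red
```

The intermediate arrays have shape (height, width, n_edges), so memory grows with the number of edges; for a 256x256 canvas and a few dozen edges that is still small.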
This code 'works', in that it only uses tensorflow functions, and does allow the model to train when used in a training loop:
def convert_image (x):
    #split off the first column of the generator output, and store it for later (remove the 'colours' column)
    colours_column = tf.slice(img_to_convert, tf.constant([0,0,0], dtype=tf.int32), tf.constant([32,1,3], dtype=tf.int32))
    #split off the rest of the data, only keeping R + G, and discarding B
    image_data_red = tf.slice(img_to_convert, tf.constant([0,1,0], dtype=tf.int32), tf.constant([32,31,1], dtype=tf.int32))
    image_data_green = tf.slice(img_to_convert, tf.constant([0,1,1], dtype=tf.int32), tf.constant([32, 31,1], dtype=tf.int32))
    #roll each row by 1 position, and make two more 2D tensors
    rolled_red = tf.roll(image_data_red, shift=-1, axis=0)
    rolled_green = tf.roll(image_data_green, shift=-1, axis=0)
    #remove all values where either the red OR green channels are 0
    zeroes = tf.constant(0, dtype=tf.float32)
    #this is for the 'count_nonzero' command
    boolean_red_data = tf.not_equal(image_data_red, zeroes)
    boolean_green_data = tf.not_equal(image_data_green, zeroes)
    initial_data_mask = tf.logical_and(boolean_red_data, boolean_green_data)
    #count non-zero values per row and flatten it
    count = tf.math.count_nonzero(initial_data_mask, 1)
    count_flat = tf.reshape(count, [-1])
    flat_red = tf.reshape(image_data_red, [-1])
    flat_green = tf.reshape(image_data_green, [-1])
    boolean_red = tf.math.logical_not(tf.equal(flat_red, tf.zeros_like(flat_red)))
    boolean_green = tf.math.logical_not(tf.equal(flat_green, tf.zeros_like(flat_red)))
    mask = tf.logical_and(boolean_red, boolean_green)
    flat_red_without_zero = tf.boolean_mask(flat_red, mask)
    flat_green_without_zero = tf.boolean_mask(flat_green, mask)
    # create a ragged tensor
    X0_ragged = tf.RaggedTensor.from_row_lengths(values=flat_red_without_zero, row_lengths=count_flat)
    Y0_ragged = tf.RaggedTensor.from_row_lengths(values=flat_green_without_zero, row_lengths=count_flat)
    #do the same for the rolled version
    rolled_data_mask = tf.roll(initial_data_mask, shift=-1, axis=1)
    flat_rolled_red = tf.reshape(rolled_red, [-1])
    flat_rolled_green = tf.reshape(rolled_green, [-1])
    #from SO "shift zeros to the end"
    boolean_rolled_red = tf.math.logical_not(tf.equal(flat_rolled_red, tf.zeros_like(flat_rolled_red)))
    boolean_rolled_green = tf.math.logical_not(tf.equal(flat_rolled_green, tf.zeros_like(flat_rolled_red)))
    rolled_mask = tf.logical_and(boolean_rolled_red, boolean_rolled_green)
    flat_rolled_red_without_zero = tf.boolean_mask(flat_rolled_red, rolled_mask)
    flat_rolled_green_without_zero = tf.boolean_mask(flat_rolled_green, rolled_mask)
    # create a ragged tensor
    X1_ragged = tf.RaggedTensor.from_row_lengths(values=flat_rolled_red_without_zero, row_lengths=count_flat)
    Y1_ragged = tf.RaggedTensor.from_row_lengths(values=flat_rolled_green_without_zero, row_lengths=count_flat)
    #available outputs for future use are:
    X0 = X0_ragged.to_tensor(default_value=0.)
    Y0 = Y0_ragged.to_tensor(default_value=0.)
    X1 = X1_ragged.to_tensor(default_value=0.)
    Y1 = Y1_ragged.to_tensor(default_value=0.)
    #Example tensor cel (replace with (x))
    P = tf.cast(x, dtype=tf.float32)
    #split out P.x and P.y, and fill a ragged tensor to the same shape as Rx
    Px_value = tf.cast(x, dtype=tf.float32) - tf.cast((tf.math.floor(x/255)*255), dtype=tf.float32)
    Py_value = tf.cast(tf.math.floor(x/255), dtype=tf.float32)
    Px = tf.squeeze(tf.ones_like(X0)*Px_value)
    Py = tf.squeeze(tf.ones_like(Y0)*Py_value)
    #for each pair of values (Y0, Y1, make a vector, and check to see if it crosses the y-value (Py) either up or down
    a = tf.math.less(Y0, Py)
    b = tf.math.greater_equal(Y1, Py)
    c = tf.logical_and(a, b)
    d = tf.math.greater_equal(Y0, Py)
    e = tf.math.less(Y1, Py)
    f = tf.logical_and(d, e)
    g = tf.logical_or(c, f)
    #Makes boolean bitwise mask
    #calculate the intersection of the line with the y-value, assuming it intersects
    #P.x <= (G.x - R.x) * (P.y - R.y) / (G.y - R.y + R.x) - use tf.divide_no_nan for safe divide
    h = tf.math.less(Px,(tf.math.divide_no_nan(((X1-X0)*(Py-Y0)),(Y1-Y0+X0))))
    #combine using AND with the mask above
    i = tf.logical_and(g,h)
    #tf.count_nonzero
    #reshape to make a column tensor with the same dimensions as the colours
    #divide by 2 using tf.floor_mod (returns remainder of division - any remainder means the value is odd, and hence the point is IN the polygon)
    final_count = tf.cast((tf.math.count_nonzero(i, 1)), dtype=tf.int32)
    twos = tf.ones_like(final_count, dtype=tf.int32)*tf.constant([2], dtype=tf.int32)
    divide = tf.cast(tf.math.floormod(final_count, twos), dtype=tf.int32)
    index = tf.cast(tf.range(0,32, delta=1), dtype=tf.int32)
    clipped_index = divide*index
    sort = tf.sort(clipped_index)
    reverse = tf.reverse(sort, [-1])
    value = tf.slice(reverse, [0], [1])
    pair = tf.constant([0], dtype=tf.int32)
    slice_tensor = tf.reshape(tf.stack([value, pair, pair], axis=0),[-1])
    output_colour = tf.slice(colours_column, slice_tensor, [1,1,3])
    return output_colour
This is where the 'convert image' function is applied using tf.vectorized_map:
def convert_images(image_to_convert):
    global img_to_convert
    img_to_convert = image_to_convert
    process_list = tf.reshape((tf.range(0,65536, delta=1, dtype=tf.int32)), [65536, 1])
    output_line = tf.vectorized_map(convert_image, process_list)
    output_line_squeezed = tf.squeeze(output_line)
    output_reshape = (tf.reshape(output_line_squeezed, [256,256,3])/127.5)-1
    output = tf.expand_dims(output_reshape, axis=0)
    return output
It is PAINFULLY slow, though - It does not appear to be using the GPU, and looks to be single threaded as well.
I'm adding it as an answer to my own question because it clearly IS possible to write this numpy function entirely in tensorflow; it just probably shouldn't be done like this.

Tensorflow gradient through while_loop

I've got a tensorflow model where the output of a layer is a 2d tensor, say t = [[1,2], [3,4]].
The next layer expects an input which consists of every row combination of this tensor. That is, I need to turn it into t_new = [[1,2,1,2], [1,2,3,4], [3,4,1,2], [3,4,3,4]].
So far I have tried:
1) tf.unstack(t, axis=0), looping over its rows and appending each combination to a buffer, then t_new = tf.stack(buffer, axis=0). This works except when the shape is unspecified, i.e. None, so...
2) I have used a tf.while_loop to generate indices idx=[[0,0], [0,1], [1,0], [1,1]], then t_new = tf.gather(t, idx).
My question here is: should I set back_prop to True or False in this tf.while_loop? I'm only generating indices inside the loop. Not sure what back_prop would even mean.
Also, do you know of a better way to achieve what I need?
Here is the while_loop:
i = tf.constant(0)
j = tf.constant(0)
idx = tf.Variable([], dtype=tf.int32)
def body(i, j, idx):
    c = tf.concat([idx, [i, j]], axis=0)
    i, j = tf.cond(tf.equal(j, sentence_len - 1),
                   lambda: (i + 1, 0),
                   lambda: (i, j + 1))
    return i, j, c
_, _, indices = tf.while_loop(lambda i, j, _: tf.less(i, sentence_len),
                              body,
                              [i, j, idx],
                              shape_invariants=[i.get_shape(),
                                                j.get_shape(),
                                                tf.TensorShape([None])])
Now I can do t_new = tf.gather(t, indices).
But I am very confused about the meaning of tf.while_loop's back_prop - in general and especially here.
In this case you are fine to set back_prop to False. It doesn't need to backpropagate through the computation of the indices, because that computation doesn't depend on any learned variables.
It depends on the context. If you are indexing over features that are produced by a differentiable function, then you want to backpropagate. However, if you are indexing over an input placeholder or input data of some kind, then you can keep it as False, just as @Aaron said.
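On the "better way" part of the question, the row combinations can also be built without any loop by repeating and tiling the rows. The sketch below uses NumPy so it runs on its own; row_pairs is a made-up name, and the assumption is that the same three steps carry over to tf.repeat, tf.tile and tf.concat (with tf.shape(t)[0] as the row count) in TensorFlow versions that provide tf.repeat.

```python
import numpy as np

def row_pairs(t):
    """Concatenate every ordered pair of rows of a 2-D array."""
    n = t.shape[0]
    left = np.repeat(t, n, axis=0)   # each row repeated n times in a block
    right = np.tile(t, (n, 1))       # the whole array repeated n times
    return np.concatenate([left, right], axis=1)

t_new = row_pairs(np.array([[1, 2], [3, 4]]))
```

With no index-generating loop, the back_prop question does not come up for this step.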

assign values to an array in a loop in tensorflow

I have an array of ones in tensorflow and I want to update its values based on another array in a for loop. Here is the code:
def get_weights(labels, class_ratio=0.5):
    weights = tf.ones_like(labels, dtype=tf.float64)
    pos_num = class_ratio * 100
    neg_num = 100 - class_ratio * 100
    for i in range(labels.shape[0]):
        if labels[i] == 0:
            weights[i].assign(pos_num/neg_num)
        else:
            weights[i].assign(neg_num)
    return weights
and then I have this code to call the above function:
with tf.Graph().as_default():
    labels = tf.placeholder(tf.int32, (5,))
    example_weights = get_weights(labels, class_ratio=0.1)
    with tf.Session() as sess:
        np_labels = np.random.randint(0, 2, 5)
        np_weights = sess.run(example_weights, feed_dict={labels: np_labels})
        print("Labels: %r" % (np_labels,))
        print("Weights: %r" % (np_weights,))
but when I run it, it gives me this error:
ValueError: Sliced assignment is only supported for variables
How can I assign/update values of an array in tensorflow?
A tf.Tensor in TensorFlow is a read-only value (in fact, a symbolic expression for computing a read-only value), so you cannot in general assign values to it. (The main exceptions are tf.Variable objects.) This means that you are encouraged to use "functional" operations to define your tensor. For example, there are several ways to generate the weights tensor functionally:
Since weights is defined as an element-wise transformation of labels, you can use tf.map_fn() to create a new tensor (containing a tf.cond() to replace the if statement) by mapping a function across it:
def get_weights(labels, class_ratio=0.5):
    pos_num = tf.constant(class_ratio * 100)
    neg_num = tf.constant(100 - class_ratio * 100)
    def compute_weight(x):
        return tf.cond(tf.equal(x, 0), lambda: pos_num / neg_num, lambda: neg_num)
    return tf.map_fn(compute_weight, labels, dtype=tf.float32)
This version allows you to apply an arbitrarily complicated function to each element of labels.
However, since the function is simple, cheap to compute, and representable using simple TensorFlow ops, you can avoid using tf.map_fn() and instead use tf.where():
def get_weights(labels, class_ratio=0.5):
    pos_num = tf.fill(tf.shape(labels), class_ratio * 100)
    neg_num = tf.fill(tf.shape(labels), 100 - class_ratio * 100)
    return tf.where(tf.equal(labels, 0), pos_num / neg_num, neg_num)
(You could also use tf.where() instead of tf.cond() in the tf.map_fn() version.)
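To check the expected values without building a graph, the same select-by-condition logic can be mirrored in NumPy. This is only a reference sketch: get_weights_np is a hypothetical name, and np.where stands in for tf.where.

```python
import numpy as np

def get_weights_np(labels, class_ratio=0.5):
    pos_num = class_ratio * 100
    neg_num = 100 - class_ratio * 100
    labels = np.asarray(labels)
    # Elements with label 0 get pos_num / neg_num, all others get neg_num.
    return np.where(labels == 0, pos_num / neg_num, neg_num)

weights = get_weights_np([0, 1, 1, 0, 1], class_ratio=0.1)
```

With class_ratio=0.1, label 0 maps to 10/90 and every other label maps to 90, which is what the TensorFlow versions above should return for the same labels.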

Applying element wise conditional functions on Theano TensorVariable

Easiest thing might be for me to just post the numpy code that I'm trying to perform directly in Theano if it's possible:
tensor = shared(np.random.randn(7, 16, 16)).eval()
tensor2 = tensor[0,:,:].eval()
tensor2[tensor2 < 1] = 0.0
tensor2[tensor2 > 0] = 1.0
new_tensor = [tensor2]
for i in range(1, tensor.shape[0]):
    new_tensor.append(np.multiply(tensor2, tensor[i,:,:].eval()))
output = np.array(new_tensor).reshape(7,16,16)
If it's not immediately obvious, what I'm trying to do is use the values from one matrix of a tensor made up of 7 different matrices and apply that to the other matrices in the tensor.
Really, the problem I'm solving is doing conditional statements in an objective function for a fully convolutional network in Keras. Basically, the loss for some of the feature map values is going to be calculated (and subsequently weighted) differently from others, depending on some of the values in one of the feature maps.
You can easily implement conditionals with the switch function (T.switch).
Here would be the equivalent code:
import theano
from theano import tensor as T
import numpy as np
def _check_new(var):
    shape = var.shape[0]
    t_1, t_2 = T.split(var, [1, shape-1], 2, axis=0)
    ones = T.ones_like(t_1)
    cond = T.gt(t_1, ones)
    mask = T.repeat(cond, t_2.shape[0], axis=0)
    out = T.switch(mask, t_2, T.zeros_like(t_2))
    output = T.join(0, cond, out)
    return output
def _check_old(var):
    tensor = var.eval()
    tensor2 = tensor[0,:,:]
    tensor2[tensor2 < 1] = 0.0
    tensor2[tensor2 > 0] = 1.0
    new_tensor = [tensor2]
    for i in range(1, tensor.shape[0]):
        new_tensor.append(np.multiply(tensor2, tensor[i,:,:]))
    output = theano.shared(np.array(new_tensor).reshape(7,16,16))
    return output
tensor = theano.shared(np.random.randn(7, 16, 16))
out1 = _check_new(tensor).eval()
out2 = _check_old(tensor).eval()
print(out1)
print('----------------')
print(((out1-out2) ** 2).mean())
Note: since you're masking on the first filter, I needed to use split and join operations.
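For reference, the loop in the question collapses to a single broadcasted multiply. The sketch below is plain NumPy (mask_by_first_map is a made-up name) and can serve as a check for the Theano version. It thresholds with >= 1, which is what the two in-place assignments in the question amount to, whereas T.gt above is a strict comparison; the two differ only for values exactly equal to 1.

```python
import numpy as np

def mask_by_first_map(x):
    """Binarise x[0] at 1 and use it to mask every other map in x."""
    mask = (x[0] >= 1).astype(x.dtype)
    # mask[None] restores the leading axis so it can be stacked back on top.
    return np.concatenate([mask[None], x[1:] * mask], axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((7, 16, 16))
out = mask_by_first_map(x)
```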
