I'm trying to use Theano to compute the hessian of a function with respect to a vector as well as a couple scalars (edit: that is, I essentially want the scalars appended to the vector that I am computing the hessian with respect to). Here's a minimal example:
import theano
import theano.tensor as T
A = T.vector('A')
b,c = T.scalars('b','c')
y = T.sum(A)*b*c
My first try was:
hy = T.hessian(y,[A,b,c])
Which fails with AssertionError: tensor.hessian expects a (list of) 1 dimensional variable as 'wrt'
My second try was to combine A, b, and c with:
wrt = T.concatenate([A,T.stack(b,c)])
hy = T.hessian(y,[wrt])
Which fails with DisconnectedInputError: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: Join.0
What is the correct way to compute the hessian in this case?
Update: To clarify on what I am looking for, suppose A is a 2 element vector. Then the Hessian would be:
[[d2y/d2A1, d2y/dA1dA2, d2y/dA1dB, d2y/dA1dC],
[d2y/dA2dA1, d2y/d2A2, d2y/dA2dB, d2y/dA2dC],
[d2y/dBdA1, d2y/dBdA2, d2y/d2B, d2y/dBdC],
[d2y/dCdA1, d2y/dCdA2, d2y/dCdB, d2y/d2C]]
which for the example function y should be:
[[0, 0, C, B],
[0, 0, C, B],
[C, C, 0, A1+A2],
[B, B, A1+A2, 0]]
So if we were to define a function:
f = theano.function([A,b,c], hy)
then, assuming we could compute hy successfully, we would expect the output:
f([1,1], 4, 5) =
[[0, 0, 5, 4],
[0, 0, 5, 4],
[5, 5, 0, 2],
[4, 4, 2, 0]]
In my actual application, A has 25 elements and y is more complicated, but the idea is the same.
If you pass b,c as vectors, it should work. The hessian operator expects 1D arrays. Even though scalars should work, too, it is probably easiest to just provide the type of input it likes.
The reason your stacking fails is that the concatenation creates a new intermediate variable (the Join.0 in the error message) that is computed from A, b and c. The cost y was built from A, b and c directly, so it does not depend on that new variable, and Theano refuses to differentiate with respect to it.
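This works for me (note that with a list as wrt, T.hessian returns one Hessian per variable, so the cross terms between A, b and c are not included):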
import theano.tensor as T
A = T.vector('A')
b,c = T.vectors('b','c')
y = T.sum(A)*b[0]*c[0]
hy = T.hessian(y,[A,b,c])
Based on a suggestion from @eickenberg to combine the inputs at the numpy level, I used the following workaround:
import theano
import theano.tensor as T
A,temp = T.vectors('A','T')
b,c = T.scalars('b','c')
y = T.sum(A)*b*c
y2 = theano.clone(y,{A:temp[:-2],b:temp[-2],c:temp[-1]})
hy = T.hessian(y2,[temp])
f = theano.function([temp], hy)
f([1,1,4,5])
gives the expected output:
> [array([[ 0., 0., 5., 4.],
> [ 0., 0., 5., 4.],
> [ 5., 5., 0., 2.],
> [ 4., 4., 2., 0.]])]
This works but feels rather awkward. If anyone knows of a better (more general) solution, please let me know.
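As an independent sanity check on the expected Hessian from the question, here is a minimal NumPy sketch that approximates it with central finite differences. It does not use Theano at all. The helper name `numerical_hessian` and the packing of the inputs as `[A..., b, c]` (mirroring `temp` in the workaround) are assumptions made purely for illustration.

```python
import numpy as np

def y(t):
    # t packs the inputs as [A..., b, c], like `temp` in the workaround
    return np.sum(t[:-2]) * t[-2] * t[-1]

def numerical_hessian(fn, t, eps=1e-4):
    # Hypothetical helper: central finite-difference approximation of the
    # Hessian of a scalar function fn at the point t.
    t = np.asarray(t, dtype=np.float64)
    n = t.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (fn(t + ei + ej) - fn(t + ei - ej)
                       - fn(t - ei + ej) + fn(t - ei - ej)) / (4 * eps ** 2)
    return H

H = numerical_hessian(y, [1.0, 1.0, 4.0, 5.0])
print(np.round(H, 3))
```

Up to rounding, the printed matrix should agree with the expected output for f([1,1], 4, 5) given in the question.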
numpy.vectorize takes a function f:a->b and turns it into g:a[]->b[].
This works fine when a and b are scalars, but I can't think of a reason why it wouldn't work with b as an ndarray or list, i.e. f:a->b[] and g:a[]->b[][]
For example:
import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
g = np.vectorize(f, otypes=[np.ndarray])
a = np.arange(4)
print(g(a))
This yields:
array([[ 0. 0. 0. 0. 0.],
[ 1. 1. 1. 1. 1.],
[ 2. 2. 2. 2. 2.],
[ 3. 3. 3. 3. 3.]], dtype=object)
Ok, so that gives the right values, but the wrong dtype. And even worse:
g(a).shape
yields:
(4,)
So this array is pretty much useless. I know I can convert it doing:
np.array(list(map(list, g(a))), dtype=np.float32)
to give me what I want:
array([[ 0., 0., 0., 0., 0.],
[ 1., 1., 1., 1., 1.],
[ 2., 2., 2., 2., 2.],
[ 3., 3., 3., 3., 3.]], dtype=float32)
but that is neither efficient nor pythonic. Can any of you guys find a cleaner way to do this?
np.vectorize is just a convenience function. It doesn't actually make code run any faster. If it isn't convenient to use np.vectorize, simply write your own function that works as you wish.
The purpose of np.vectorize is to transform functions which are not numpy-aware (e.g. take floats as input and return floats as output) into functions that can operate on (and return) numpy arrays.
Your function f is already numpy-aware -- it uses a numpy array in its definition and returns a numpy array. So np.vectorize is not a good fit for your use case.
The solution therefore is just to roll your own function f that works the way you desire.
The signature parameter, new in NumPy 1.12.0, does exactly what you want.
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
g = np.vectorize(f, signature='()->(n)')
Then g(np.arange(4)).shape will give (4L, 5L).
Here the signature of f is specified. The (n) is the shape of the return value, and the () is the shape of the parameter which is scalar. And the parameters can be arrays too. For more complex signatures, see Generalized Universal Function API.
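To illustrate the remark that the parameters can be arrays too, here is a small sketch using a signature with two 1-D core arguments. The function `weighted_sum` and the sample data are hypothetical, chosen only to show how the core dimensions in the signature map onto shapes.

```python
import numpy as np

def weighted_sum(row, weights):
    # Operates on two 1-D arrays of equal length and returns a scalar
    return float(np.dot(row, weights))

# '(m),(m)->()': two 1-D inputs of the same length m, scalar output
g = np.vectorize(weighted_sum, signature='(m),(m)->()')

rows = np.arange(6.0).reshape(2, 3)   # two rows of length 3
w = np.array([1.0, 0.0, 2.0])         # one weight vector, broadcast over rows

out = g(rows, w)
print(out.shape)  # (2,)
print(out)
```

The leading dimension of `rows` is treated as a loop dimension, while `w` is reused for every row, in the same way ordinary broadcasting would handle it.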
import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
g = np.vectorize(f, otypes=[np.ndarray])
a = np.arange(4)
b = g(a)
b = np.array(b.tolist())
print(b)#b.shape = (4,5)
c = np.ones((2,3,4))
d = g(c)
d = np.array(d.tolist())
print(d)#d.shape = (2,3,4,5)
This should fix the problem, and it works regardless of the shape of your input. "map" only works for one-dimensional inputs. Using ".tolist()" and creating a new ndarray solves the problem more completely and nicely (I believe). Hope this helps.
You want to vectorize the function
import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
Assuming that you want a single np.float32 array as the result, you have to specify that in otypes. In your question, however, you specified otypes=[np.ndarray], which means you want every element to be an np.ndarray. Thus, you correctly get a result of dtype=object.
The correct call would be
np.vectorize(f, signature='()->(n)', otypes=[np.float32])
For such a simple function, however, it is better to leverage numpy's ufuncs; np.vectorize just loops over the inputs in Python. So in your case just rewrite your function as
def f(x):
    return np.multiply.outer(x, np.array([1,1,1,1,1], dtype=np.float32))
This is faster and produces less obscure errors (note, however, that the result's dtype depends on x: if you pass a complex or quad-precision number, the result will have that type too).
I've written a function that seems to fit your need.
def amap(func, *args):
    '''array version of built-in map
    amap(function, sequence[, sequence, ...]) -> array
    Examples
    --------
    >>> amap(lambda x: x**2, 1)
    array(1)
    >>> amap(lambda x: x**2, [1, 2])
    array([1, 4])
    >>> amap(lambda x,y: y**2 + x**2, 1, [1, 2])
    array([2, 5])
    >>> amap(lambda x: (x, x), 1)
    array([1, 1])
    >>> amap(lambda x,y: [x**2, y**2], [1,2], [3,4])
    array([[1, 9], [4, 16]])
    '''
    # Broadcast all arguments together; the leading None is a dummy
    # first operand that is dropped again via arg[1:]
    args = np.broadcast(None, *args)
    res = np.array([func(*arg[1:]) for arg in args])
    # Broadcast shape first, then whatever shape func itself returns
    shape = args.shape + res.shape[1:]
    return res.reshape(shape)
Let's try it:
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
amap(f, np.arange(4))
Outputs
array([[ 0., 0., 0., 0., 0.],
[ 1., 1., 1., 1., 1.],
[ 2., 2., 2., 2., 2.],
[ 3., 3., 3., 3., 3.]], dtype=float32)
You may also wrap it with lambda or partial for convenience
g = lambda x:amap(f, x)
g(np.arange(4))
Note the docstring of vectorize says
The vectorize function is provided primarily for convenience, not for
performance. The implementation is essentially a for loop.
Thus we would expect amap to have performance similar to vectorize. I didn't check it; any performance tests are welcome.
If performance is really important, you should consider something else, e.g. direct array calculation with reshape and broadcasting to avoid a loop in pure Python (both vectorize and amap fall into the latter case).
The best way to solve this would be to use a 2-D NumPy array (in this case a column array) as an input to the original function, which will then generate a 2-D output with the results I believe you were expecting.
Here is what it might look like in code:
import numpy as np
def f(x):
    return x*np.array([1, 1, 1, 1, 1], dtype=np.float32)
a = np.arange(4).reshape((4, 1))
b = f(a)
# b is a 2-D array with shape (4, 5)
print(b)
This is a much simpler and less error-prone way to complete the operation. Rather than trying to transform the function with numpy.vectorize, this method relies on NumPy's natural ability to broadcast arrays. The trick is to make sure the shapes are broadcast-compatible: aligned from the trailing dimension, each pair of dimensions must either be equal or one of them must be 1. Here (4, 1) and (5,) combine to (4, 5).
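The broadcasting rule can be checked directly without running the multiplication. The following is a minimal sketch; the variable names are illustrative, and `x[:, None]` is just another way of writing the `reshape((4, 1))` used above.

```python
import numpy as np

x = np.arange(4)                      # shape (4,)
row = np.ones(5, dtype=np.float32)    # shape (5,)

col = x[:, None]                      # shape (4, 1), same as x.reshape((4, 1))

# np.broadcast reports the shape the result would have
result_shape = np.broadcast(col, row).shape
print(result_shape)

out = col * row
print(out.shape)
```

Multiplying `x` and `row` directly would fail, since shapes (4,) and (5,) are not compatible; adding the trailing axis of length 1 is what makes the pair broadcast.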
I am looking for an efficient method to sum floating-point variables into buckets specified by another tensor.
I specify efficient because the actual inputs and outputs are rather big. They fit into memory without an issue (millions of elements), but if we square the memory or computation complexity from what's necessary, we would run into problems.
Also, a C-like solution that just loops over the inputs and adds to an array/hashmap is theoretically efficient but results in terrible execution time in TF. I am seeking something that properly leverages multiprocessing, which usually boils down to not having non-parallelized loops over single array elements.
Example of the problem:
import tensorflow as tf
import numpy as np
buckets = 8
indices = tf.convert_to_tensor([0, 0, 0, 1, 1, 3, 3, 5], tf.int32)
values = tf.convert_to_tensor([.5, .3, .2, .1, 1., 1., 1., .1], tf.float32)
# Inefficient solution that adds a new dimension,
# materializes a dense tensor, and then sums along the added dimension
# memory and computation complexity is O(buckets x indices.size), that's absolutely terrible.
indices_new_axis = tf.range(indices.shape[0], dtype=tf.int64)  # one column per input element, not per bucket
indices = tf.stack([tf.cast(indices, tf.int64), indices_new_axis], axis=-1)
sparse_repr = tf.SparseTensor(indices, values, dense_shape=[buckets, indices.shape[0]])
dense_repr = tf.sparse.to_dense(sparse_repr)
result = tf.reduce_sum(dense_repr, axis=1)
print(result)
expected_result_dense = [1., 1.1, 0., 2., 0., .1, 0., 0.]
np.testing.assert_array_almost_equal(expected_result_dense, result.numpy())
# the same with sparse representation would also be good:
expected_indices_sparse = [0, 1, 3, 5]
expected_values_sparse = [1., 1.1, 2., .1]
Some technical background to the problem:
I am trying out the Hough transform for some analytical shapes with confidence voting (without strictly thresholding the gradient, although most points from the original image are still eliminated). For the version without weights, I just use tf.bincount on the indices and that works perfectly; I wonder if I can do something similar here.
I am aware that in this specific case, I can avoid the problem altogether by accumulating results while iterating through each lit pixel in the original image (there will be no duplicates, so I can materialize votes as buckets sized dense tensors and add them to the accumulator), but that's way less efficient than what should be possible.
One day later, I finally found the solution. It's... well... rather simple:
import tensorflow as tf
import numpy as np
buckets = 8
indices = tf.convert_to_tensor([0, 0, 0, 1, 1, 3, 3, 5], tf.int32)
values = tf.convert_to_tensor([.5, .3, .2, .1, 1., 1., 1., .1], tf.float32)
result = tf.math.unsorted_segment_sum(values, indices, buckets)
print(result)
expected_result_dense = [1., 1.1, 0., 2., 0., .1, 0., 0.]
np.testing.assert_array_almost_equal(expected_result_dense, result.numpy())
Turns out that there is already a function in TF 2.6 (and has been in TF for a while) that does exactly that. I was unable to find it since I was searching around tf.bincount, which is similar, but named completely differently.
On the way to the above solution I also found another one that used tf.sparse.reduce_sum() (this required converting the data into a SparseTensor first), but the one above was much faster.
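For readers who want to see the semantics outside TensorFlow, here is a NumPy analogue of the same bucketed sum. This is only a sketch of what the operation computes, not of how TF implements it: it swaps in `np.bincount` with its `weights` argument, and derives the sparse form with `np.unique`.

```python
import numpy as np

buckets = 8
indices = np.array([0, 0, 0, 1, 1, 3, 3, 5], dtype=np.int32)
values = np.array([.5, .3, .2, .1, 1., 1., 1., .1], dtype=np.float32)

# Weighted bincount: sums values[i] into slot indices[i]
dense = np.bincount(indices, weights=values, minlength=buckets)

# Sparse form: only the buckets that actually received a value
sparse_indices = np.unique(indices)
sparse_values = dense[sparse_indices]

print(dense)
print(sparse_indices, sparse_values)
```

Memory and work here scale with the number of inputs plus the number of buckets, rather than with their product.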
Using numpy, given a square matrix A and a column vector x, use np.linalg.solve to compute A^(−1)x.
The documentation provides a simple example
>>> a = np.array([[3,1], [1,2]])
>>> b = np.array([9,8])
>>> x = np.linalg.solve(a, b)
>>> x
array([ 2., 3.])
But I do not see how this example relates to the given problem or how it can be applied to solve it.
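The connection is that solving A y = x for y gives y = A^(−1)x, so np.linalg.solve(A, x) returns the requested product without forming the inverse explicitly. A minimal sketch, reusing the numbers from the documentation example (the variable names `y` and `y_via_inverse` are just for illustration):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
x = np.array([9.0, 8.0])

# Solve A @ y = x, i.e. y = A^(-1) x
y = np.linalg.solve(A, x)

# Same quantity computed via the explicit inverse, for comparison only
y_via_inverse = np.linalg.inv(A) @ x

print(y)
print(np.allclose(y, y_via_inverse))
```

If x is stored as a true column vector of shape (n, 1), np.linalg.solve accepts that as well and returns a result of the same shape.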
I am new to Python, coming from MATLAB. When I want to save a vector to a preallocated matrix in MATLAB, I do this (MATLAB code):
a = zeros(5, 2)
b = zeros(5, 1)
% save elements of b in the first column of a
a(:, 1) = b
Now I am using numpy in Python. I do not really know how to describe this problem. What I am doing here is essentially this:
a = np.zeros([5, 2])
b = np.ones([5, 1])
a[:, 0] = np.reshape(b, a[:, 0].shape)
because the following solution is not working:
a[:, 0] = b # Not working
Can anyone point out other ways of doing it, closer to the MATLAB style?
Simple way would be -
a[:,[0]] = b
Sample run -
In [217]: a = np.zeros([5, 2])
...: b = np.ones([5, 1])
...:
In [218]: a[:,[0]] = b
In [219]: a
Out[219]:
array([[ 1., 0.],
[ 1., 0.],
[ 1., 0.],
[ 1., 0.],
[ 1., 0.]])
Basically, indexing with a scalar as in a[:,0] reduces the number of dimensions (the dimension along which the scalar is used is removed), so the assignment target is 1D. When we specify a list of indices like a[:,[0]], the dimensions are preserved, i.e. the target stays 2D, which allows us to assign b, which is also 2D. Let's test that out -
In [225]: a[:,0].shape
Out[225]: (5,) # 1D array
In [226]: a[:,[0]].shape
Out[226]: (5, 1) # 2D array
In [227]: b.shape
Out[227]: (5, 1) # 2D array
For reference, here's a link to the slicing scheme. Quoting the relevant part from it -
An integer, i, returns the same values as i:i+1 except the
dimensionality of the returned object is reduced by 1.
In particular, a selection tuple with the p-th element an integer (and all other
entries :) returns the corresponding sub-array with dimension N - 1.
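Following from the dimensionality rule quoted above, there are a few equivalent ways to make the shapes on both sides of the assignment agree. This is a small sketch comparing them; the names a1 to a4 are only for illustration.

```python
import numpy as np

b = np.ones((5, 1))

# Keep the target 2D so it matches b's shape (5, 1)
a1 = np.zeros((5, 2))
a1[:, [0]] = b          # list index keeps the column axis

a2 = np.zeros((5, 2))
a2[:, 0:1] = b          # a length-1 slice also keeps the column axis

# Or make the source 1D so it matches the (5,) target
a3 = np.zeros((5, 2))
a3[:, 0] = b[:, 0]      # integer index on both sides

a4 = np.zeros((5, 2))
a4[:, 0] = b.ravel()    # flatten b to shape (5,)

print(a1)
```

The slice form `a[:, 0:1]` is probably the closest in spirit to MATLAB's `a(:, 1)`, since it writes into a view of the column without changing the number of dimensions.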
I am trying to get the ranks of a 2-d tensor in Tensorflow. I can do that in numpy using something like:
import numpy as np
from scipy.stats import rankdata
a = np.array([[0,3,2], [6,5,4]])
ranks = np.apply_along_axis(rankdata, 1, a)
And ranks is:
array([[ 1., 3., 2.],
[ 3., 2., 1.]])
My question is how can I do this in tensorflow?
import tensorflow as tf
a = tf.constant([[0,3,2], [6,5,4]])
sess = tf.InteractiveSession()
ranks = magic_function(a)
ranks.eval()
tf.nn.top_k would work for you, although it has slightly different semantics: it returns sort indices, which happen to equal the ranks in this example but not in general. Please read the documentation to see how to use it for your case. Here is a snippet that solves your example:
sess = tf.InteractiveSession()
a = tf.constant(np.array([[0,3,2], [6,5,4]]))
# tf.nn.top_k returns the largest values first (descending order), so negate to switch the sense
_, ranks = tf.nn.top_k(-a, 3)
# top_k outputs 0 based indices, so add 1 to get the same
# effect as rankdata
ranks = ranks + 1
sess.run(ranks)
# output :
# array([[1, 3, 2],
# [3, 2, 1]], dtype=int32)
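One caveat worth illustrating: the indices from a sort are the sort order, not the ranks, and the two only coincide for special inputs such as the one in the question. The sketch below uses NumPy to show the difference and the usual fix of applying argsort twice. The input array is made up for illustration, and ties are broken arbitrarily here rather than averaged as scipy's rankdata does.

```python
import numpy as np

a = np.array([[30, 10, 20],
              [6, 5, 4]])

# Sort order: position k holds the index of the k-th smallest element
order = np.argsort(a, axis=1)

# Ranks: argsort of the sort order gives each element's position in the
# sorted row; add 1 for 1-based ranks
ranks = np.argsort(order, axis=1) + 1

print(order + 1)
print(ranks)
```

In TensorFlow the same double-argsort idea should carry over using tf.argsort along the last axis, though that variant is not tested here.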