tensorflow matrix multiplication - python

So, I want to multiply a matrix with a matrix. When I try an array with a matrix, it works:
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 3])
W = tf.Variable(tf.ones([3, 3]))
y = tf.matmul(x, W)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    curr_y = sess.run(y, feed_dict={x: [[1, 2, 3], [0, 4, 5]]})
    print(curr_y)
So the batch size is 2 and each example is an array of shape 3x1; multiplying the 3x3 matrix with each 3x1 array works. But when W is again a 3x3 matrix and each batch element is a matrix of shape 3x2 instead of an array (batch size still 2), it's not working. In other words, multiplying a matrix with a batch of matrices doesn't work:
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 3, 3])
W = tf.Variable(tf.ones([3, 3]))
y = tf.matmul(x, W)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    curr_y = sess.run(y, feed_dict={x: [[[1, 2, 3], [1, 2, 3]], [[1, 1, 4], [0, 4, 5]]]})
    print(curr_y)
ValueError: Shape must be rank 2 but is rank 3 for 'MatMul' (op:
'MatMul') with input shapes: [?,3,3], [3,3].
EDIT:
Sorry, what I want to do is matmul a matrix with a batch of matrices or arrays. So I don't want to do
y = tf.matmul(x, W)
actually, I want to do
y = tf.matmul(W, x)

Your input to tensor 'x' has shape (2, 2, 3).
You're trying to do matrix multiplication of (2, 2, 3) and (3, 3). They don't have the same rank, and that's the reason for the error.
From the TensorFlow official site:
https://www.tensorflow.org/api_docs/python/tf/matmul
Args:
a: Tensor of type float16, float32, float64, int32, complex64, complex128 and rank > 1.
b: Tensor with same type and rank as a.

When you do matrix multiplication, the shapes of the matrices need to follow the rule
(a, b) * (b, c) = (a, c)
Keep in mind the shape of W as you defined is (3, 3).
The feed_dict={x: [[1,2,3],[0,4,5]]} in your first example is a 2D array; its shape is (2, 3):
In [67]: x = [[1, 2, 3], [0, 4, 5]]
In [68]: x = np.array(x)
In [69]: x.shape
Out[69]: (2, 3)
It follows the rule (2, 3) * (3, 3) => (2, 3)
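To verify the first case numerically with NumPy (a short sketch using the same x and an all-ones W):
import numpy as np

x = np.array([[1, 2, 3], [0, 4, 5]], dtype=np.float32)  # shape (2, 3)
W = np.ones((3, 3), dtype=np.float32)                   # shape (3, 3)
print((x @ W).shape)  # (2, 3)
print(x @ W)
# [[6. 6. 6.]
#  [9. 9. 9.]]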
But in your second example, the shape doesn't follow the rule of multiplication. The shape of your input is (2, 2, 3), which doesn't even have the same rank as your defined W, so it won't work:
In [70]: foo = [[[1,2,3],[1,2,3]],[[1,1,4],[0,4,5]]]
In [71]: foo = np.array(foo)
In [72]: foo.shape
Out[72]: (2, 2, 3)
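Regarding the edit: to multiply W against every matrix in a batch, one option is tf.einsum, which applies the contraction to each batch element. Here is a minimal sketch, assuming each batch element is a 3x3 matrix so the inner dimensions line up (note the original feed of shape (2, 2, 3) would not match the [None, 3, 3] placeholder either):
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 3, 3])  # batch of 3x3 matrices
W = tf.Variable(tf.ones([3, 3]))
# For each batch element b: y[b] = W @ x[b], i.e. (3, 3) x (3, 3) -> (3, 3)
y = tf.einsum('ij,bjk->bik', W, x)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    curr_y = sess.run(y, feed_dict={x: [[[1, 2, 3], [1, 2, 3], [1, 2, 3]],
                                        [[1, 1, 4], [0, 4, 5], [0, 0, 1]]]})
    print(curr_y.shape)  # (2, 3, 3)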

Related

Getting the expected dimensions of the Jacobian with JAX?

I am trying to get the Jacobian for a simple parameterization function within JAX. The code is as follows:
# imports
import jax
import jax.numpy as jnp
from jax import jacobian, random, vmap

# simple parameterization function
def reparameterize(v_params):
    theta = v_params[0] + jnp.exp(v_params[1]) * eps
    return theta
Suppose I initialize eps to be a vector of shape (3,) and v_params to be of shape (3, 2):
key = random.PRNGKey(2022)
eps = random.normal(key, shape=(3,))
key, _ = random.split(key)
v_params = random.normal(key, shape=(3, 2))
I want the Jacobian to be an array of shape (3, 2), but using
jacobian(vmap(reparameterize))(v_params)
returns an array of shape (3, 3, 3, 2). If I re-initialize with only a single eps:
key, _ = random.split(key)
eps = random.normal(key, shape=(1, ))
key, _ = random.split(key)
v_params = random.normal(key, shape=(2, ))
and call jacobian(reparameterize)(v_params), I get what I want, i.e., an array of shape (2,). Effectively looping over all eps and stacking the results of each Jacobian gives me the desired Jacobian (and shape). What am I missing here? Thanks for your help!
For a function f that maps an input of shape shape_in to an output of shape shape_out, the jacobian will have shape (*shape_out, *shape_in).
In your case, vmap(reparameterize) takes an array of shape (3, 2) and returns an array of shape (3, 3), so the output of the jacobian is an array of shape (3, 3, 3, 2).
It's hard to tell from your question what computation you were intending, but if you want a jacobian the same shape as the input, you need a function that maps the input to a scalar. Perhaps the sum is what you had in mind?
result = jacobian(lambda x: vmap(reparameterize)(x).sum())(v_params)
print(result.shape)
# (3, 2)
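To see the shape rule in isolation, here is a minimal sketch with a hypothetical function f that maps shape (3, 2) to shape (3,):
import jax
import jax.numpy as jnp

# f: (3, 2) -> (3,), so its jacobian has shape (3,) + (3, 2) = (3, 3, 2)
f = lambda x: x.sum(axis=1)
J = jax.jacobian(f)(jnp.ones((3, 2)))
print(J.shape)  # (3, 3, 2)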

Element-wise multiply a dense vector with each row of a sparse matrix in Tensorflow

Suppose I have a sparse matrix
A = tf.sparse.SparseTensor(indices=[[0, 0], [1, 1], [1, 2]], values=[1, 1, 1],
                           dense_shape=[2, 3])
and a dense vector
B = tf.constant([4,3,5])
The shapes of matrix A and vector B are (2, 3) and (1, 3), respectively. I would like to element-wise multiply B with each row of A. The expected result is another sparse matrix, say
C = tf.sparse.SparseTensor(indices=[[0, 0], [1, 1], [1, 2]], values=[4, 3, 5],
                           dense_shape=[2, 3])
I know it would be relatively easy if A is a dense matrix, but the dense size of A is extremely large and most of the elements in A are zero.
Just multiplying with an asterisk * works.
tf.reduce_all(tf.sparse.to_dense(A * B) == tf.sparse.to_dense(C))
<tf.Tensor: shape=(), dtype=bool, numpy=True>
Btw, B has shape (3,), not (1, 3)
This is the result of this operation:
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[4, 0, 0],
       [0, 3, 5]])>
You could also have done this manually, but keep an eye on the indices: multiplying A.values * B directly only lines up here because the column indices of the nonzeros happen to be exactly 0, 1, 2. In general, gather B at each nonzero's column index:
tf.sparse.SparseTensor(indices=A.indices,
                       values=A.values * tf.gather(B, A.indices[:, 1]),
                       dense_shape=A.dense_shape)
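As a reusable form of the manual approach, here is a minimal sketch (scale_sparse_rows is a hypothetical name, not a TensorFlow API):
import tensorflow as tf

def scale_sparse_rows(A, B):
    # Hypothetical helper: scale each nonzero of sparse matrix A at
    # position (i, j) by B[j], returning a new SparseTensor.
    return tf.sparse.SparseTensor(
        indices=A.indices,
        values=A.values * tf.gather(B, A.indices[:, 1]),
        dense_shape=A.dense_shape)

# Usage with the tensors from the question:
C = scale_sparse_rows(A, B)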

Broadcast SparseTensor in tensorflow

I want to elementwise multiply a dense tensor of shape [n, n, k] with a sparse tensor of shape [n, n, 1]. I want the values from the sparse tensor to repeat along the axis of size k, as they would if I used a dense tensor instead and relied on implicit broadcasting.
However, the SparseTensor.__mul__ operation does not support broadcasting the sparse operand, and I didn't find an operator to explicitly broadcast the sparse tensor. How could I achieve this?
If you do not want to just convert the sparse tensor to dense, you can select the right values from the dense tensor to build a sparse result directly, something like this:
import tensorflow as tf
import numpy as np

with tf.Graph().as_default(), tf.Session() as sess:
    # Input data
    x = tf.placeholder(tf.float32, shape=[None, None, None])
    y = tf.sparse.placeholder(tf.float32, shape=[None, None, 1])
    # Indices of sparse tensor without third index coordinate
    indices2 = y.indices[:, :-1]
    # Values of dense tensor corresponding to sparse tensor values
    x_sp = tf.gather_nd(x, indices2)
    # Values of the resulting sparse tensor
    res_vals = tf.reshape(x_sp * tf.expand_dims(y.values, 1), [-1])
    # Shape of the resulting sparse tensor
    res_shape = tf.shape(x, out_type=tf.int64)
    # Make sparse tensor indices
    k = res_shape[2]
    v = tf.size(y.values)
    # Add third coordinate to existing sparse tensor coordinates
    idx1 = tf.tile(tf.expand_dims(indices2, 1), [1, k, 1])
    idx2 = tf.tile(tf.range(k), [v])
    res_idx = tf.concat([tf.reshape(idx1, [-1, 2]), tf.expand_dims(idx2, 1)], axis=1)
    # Make sparse result
    res = tf.SparseTensor(res_idx, res_vals, res_shape)
    # Dense value for testing
    res_dense = tf.sparse.to_dense(res)
    # Dense operation for testing
    res_dense2 = x * tf.sparse.to_dense(y)
    # Test
    x_val = np.arange(48).reshape(4, 4, 3)
    y_val = tf.SparseTensorValue([[0, 0, 0], [2, 3, 0], [3, 1, 0]], [1, 2, 3], [4, 4, 1])
    res_dense_val, res_dense2_val = sess.run((res_dense, res_dense2),
                                             feed_dict={x: x_val, y: y_val})
    print(np.allclose(res_dense_val, res_dense2_val))
    # True

ValueError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul') with input shapes: [2], [2,3]

I'm new to TensorFlow. I already searched for similar questions, but I couldn't understand the answers. Here is the code. Hope you can help me.
Code:
import tensorflow as tf
w1 = tf.Variable(tf.random_normal([2,3],stddev=1,seed=1))
w2 = tf.Variable(tf.random_normal([3,3],stddev=1,seed=1))
x = tf.constant([0.7,0.9])
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)
sess = tf.Session()
sess.run(w1.initializer)
sess.run(w2.initializer)
print(sess.run(y))
sess.close()
The shape of constant x is (2,), i.e. a one-dimensional array, and you are trying to multiply it with a two-dimensional array w1 of shape (2, 3). That is not possible for matrix multiplication, since the number of columns of the first operand must equal the number of rows of the second. Also, tf.matmul requires both arguments to have rank of at least 2.
One of the many ways to fix this is to change your declaration of x to
x = tf.constant([[0.7], [0.9]])
This will create a two-dimensional constant tensor of shape (2, 1). Then multiply it as:
a = tf.matmul(tf.transpose(x), w1)
tf.transpose() turns x of shape (2, 1) into its transpose of shape (1, 2), so the multiplication (1, 2) x (2, 3) yields a (1, 3) result.
Hope this helps.
In your case, the rank of the variable x is 1; hence the issue. The following is the reason you are having it.
Please refer to the TensorFlow API: https://www.tensorflow.org/api_docs/python/tf/matmul
tf.matmul(a, b, transpose_a=False, transpose_b=False, adjoint_a=False, adjoint_b=False,
          a_is_sparse=False, b_is_sparse=False, name=None)
Args:
a: Tensor of type float16, float32, float64, int32, complex64, complex128 and rank > 1.
b: Tensor with same type and rank as a.
The shape of x, which is (2,), does not match the shape (2, 3) of w1.
You should change
x = tf.constant([0.7,0.9])
to
x = tf.constant([[0.7,0.9]])
Now the shape of x is (1, 2) and it works fine.
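Putting it together, here is a minimal corrected version of the original program (TF 1.x style), using the second suggestion:
import tensorflow as tf

w1 = tf.Variable(tf.random_normal([2, 3], stddev=1, seed=1))
w2 = tf.Variable(tf.random_normal([3, 3], stddev=1, seed=1))
x = tf.constant([[0.7, 0.9]])  # rank 2, shape (1, 2)

a = tf.matmul(x, w1)  # (1, 2) x (2, 3) -> (1, 3)
y = tf.matmul(a, w2)  # (1, 3) x (3, 3) -> (1, 3)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y))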

Tensorflow placeholder declaration

I'm trying to convert a tutorial from Keras to TF.
I'm getting the following error:
Traceback (most recent call last):
  File "/Users/spicyramen/Documents/Development/google/python/machine_learning/deep_learning/exercise1_tf.py", line 64, in <module>
    sess.run(train_step, feed_dict=train_data)
  File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/Library/Python/2.7/site-packages/tensorflow/python/client/session.py", line 975, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (768,) for Tensor u'Placeholder_1:0', which has shape '(?, 1)'
This seems to be related to how I'm passing the target labels and how my placeholder value is declared.
When I return the labels I have this:
>>> dataset[:, 8].shape
(768,)
>>> dataset[:, 0:8].shape
(768, 8)
Code
import tensorflow as tf
import numpy as np
print("Tensorflow version: " + tf.__version__)
tf.set_random_seed(0)
FILENAME = 'pima-indians-diabetes.csv'
_LEARNING_RATE = 0.003
_NUM_FEATURES = 8
_NUM_LABELS = 1
_NUM_EPOCHS = 150
_BATCH_SIZE = 10
def import_data(filename):
    if filename:
        dataset = np.loadtxt(filename, delimiter=",")
        return dataset[:, 0:8], dataset[:, 8]
# create placeholder. Dataset contains _NUM_FEATURES features:
X = tf.placeholder(tf.float32, [None, _NUM_FEATURES])
Y_ = tf.placeholder(tf.float32,[None, _NUM_LABELS]) # Placeholder for correct answers
# weights and biases
W = tf.Variable(tf.random_normal([_NUM_FEATURES, _NUM_LABELS],
                                 mean=0,
                                 stddev=0.1,
                                 name='weights'))
b = tf.Variable(tf.random_normal([1, _NUM_LABELS],
                                 mean=0,
                                 stddev=0.1,
                                 name='bias'))
# activation function
Y = tf.nn.relu(tf.matmul(X, W) + b, name='activation')
# cost function i.e. sigmoid_cross_entropy_with_logits
cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=Y_, logits=Y, name='loss_function')
optimizer = tf.train.AdamOptimizer(_LEARNING_RATE) # Formal derivation
train_step = optimizer.minimize(cross_entropy)
# cost function i.e. RMSE
# cross_entropy = tf.nn.l2_loss(Y - Y_, name="squared_error_cost")
# optimizer = tf.train.GradientDescentOptimizer(_LEARNING_RATE)
# train_step = optimizer.minimize(cross_entropy)
is_correct = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# init
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for i in range(_NUM_EPOCHS):
    # data
    batch_X, batch_Y = import_data(FILENAME)
    # train
    train_data = {X: batch_X, Y_: batch_Y}
    sess.run(train_step, feed_dict=train_data)
    a, c = sess.run([accuracy, cross_entropy], feed_dict=train_data)
    print(str(i) + ": accuracy:" + str(a) + " loss: " + str(c))
This is your problem here:
>>> dataset[:, 8].shape
(768,)
TensorFlow is expecting an array of shape (768, 1), not (768,), as the error says:
Cannot feed value of shape (768,) for Tensor u'Placeholder_1:0', which has shape '(?, 1)'
The difference between the two shapes is somewhat small, and NumPy would normally broadcast these for you in many circumstances, but TF won't. See the difference between those two shapes in this question with a great answer.
Luckily in your case the solution is very simple. You can use np.expand_dims() to turn your (768,) vector into a (768,1) vector, as demonstrated here:
>>> np.array([5,5,5]).shape
(3,)
>>> np.expand_dims(np.array([5,5,5]), axis=1).shape
(3, 1)
In your import_data function, simply change the return line to
return dataset[:, 0:8], np.expand_dims(dataset[:, 8], axis=1)
Edit: I like the above because np.expand_dims is a little more explicit, but there's another way that is equally simple, and others might find it clearer; it just depends on what you're used to. Figured I'd include it for completeness. The difference between a (N,) and a (N,1) array is that the first is a 1-dimensional array, np.array([5, 5, 5]), while the second is a 2-dimensional array, np.array([[5], [5], [5]]). You can turn your 1-d array into a 2-d array by adding a bracket around it; but then it's a row and not a column, so it needs to be transposed. So here are the two suggested ways together; B is the new suggestion, C is the suggestion above:
>>> A = np.array([5,5,5])
>>> B = np.array([A]).T
>>> C = np.expand_dims(A, axis=1)
>>> A; A.shape
array([5, 5, 5])
(3,)
>>> B; B.shape
array([[5],
       [5],
       [5]])
(3, 1)
>>> C; C.shape
array([[5],
       [5],
       [5]])
(3, 1)
Edit2: Also TensorFlow itself has a tf.expand_dims() function.
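For example, the same reshaping can be done inside the graph instead of in NumPy (a minimal sketch):
import tensorflow as tf

labels = tf.constant([1., 0., 1.])          # shape (3,)
labels_2d = tf.expand_dims(labels, axis=1)  # shape (3, 1)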
