As the title states, I'm trying to extract the highest n elements per row from a matrix in tensorflow, and store the result in a sparse Tensor.
I've been able to extract the indices and values with tf.nn.top_k, but the indices don't follow the convention required by tf.SparseTensor.
Specifically, tf.nn.top_k returns a matrix of column indices with the same shape as the resulting value matrix (rows x n), whereas tf.SparseTensor wants a (# non-zero x 2) matrix with one row per non-zero element and the columns holding the row and column indices.
The values have an analogous problem: a flat list of non-zero elements is wanted instead of a matrix of values.
How can I quickly convert between these indexing notation schemes?
This is doable with a bit of modular arithmetic. Here's an example that works on matrices, although it would be possible to loop over more axes.
import tensorflow as tf
def slices_to_dims(slice_indices):
    """
    Args:
      slice_indices: An [N, k] Tensor mapping to column indices.
    Returns:
      An index Tensor with shape [N * k, 2], corresponding to indices suitable for
      passing to SparseTensor.
    """
    slice_indices = tf.cast(slice_indices, tf.int64)
    num_rows = tf.shape(slice_indices, out_type=tf.int64)[0]
    row_range = tf.range(num_rows)
    item_numbers = slice_indices * num_rows + tf.expand_dims(row_range, axis=1)
    item_numbers_flat = tf.reshape(item_numbers, [-1])
    return tf.stack([item_numbers_flat % num_rows,
                     item_numbers_flat // num_rows], axis=1)
Example usage:
dense_shape = [5, 7]
dense_matrix = tf.random_normal(shape=dense_shape)
top_values, top_indices = tf.nn.top_k(dense_matrix, k=2)
sparse_indices = slices_to_dims(top_indices)
sparse_tensor = tf.sparse_reorder(tf.SparseTensor(
    indices=sparse_indices,
    values=tf.reshape(top_values, [-1]),
    dense_shape=dense_shape))
densified_top = tf.sparse_tensor_to_dense(sparse_tensor)
with tf.Session() as session:
    sparse_top, dense_original, dense_selected = session.run(
        [sparse_tensor, dense_matrix, densified_top])
    print(dense_original)
    print(dense_selected)
    print(sparse_top)
Prints:
[[ 1.44056129 -1.01790774 -0.2795608 2.34854746 -2.27528405 -0.62035948
3.36598897]
[ 0.7114948 -0.42564821 -0.93446779 -0.25373486 -0.51730365 0.72331643
-0.75625718]
[-0.6501748 -0.92748415 -0.95409006 -0.07157528 0.80637723 -0.32177576
-1.4516511 ]
[-1.081038 -0.67226124 -1.19455576 0.44537872 -0.69019234 -0.61539739
0.15328468]
[ 0.43032476 -0.11295394 0.83491379 -0.67906654 0.20325914 -0.0155068
0.52107805]]
[[ 0. 0. 0. 2.34854746 0. 0.
3.36598897]
[ 0.7114948 0. 0. 0. 0. 0.72331643
0. ]
[ 0. 0. 0. -0.07157528 0.80637723 0. 0. ]
[ 0. 0. 0. 0.44537872 0. 0.
0.15328468]
[ 0. 0. 0.83491379 0. 0. 0.
0.52107805]]
SparseTensorValue(indices=array([[0, 3],
[0, 6],
[1, 0],
[1, 5],
[2, 3],
[2, 4],
[3, 3],
[3, 6],
[4, 2],
[4, 6]]), values=array([ 2.34854746, 3.36598897, 0.7114948 , 0.72331643, -0.07157528,
0.80637723, 0.44537872, 0.15328468, 0.83491379, 0.52107805], dtype=float32), dense_shape=array([5, 7]))
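For comparison, the same index conversion can be sketched in plain NumPy by repeating each row number k times instead of using modular arithmetic. This is only an illustration: `top_k_to_coo` is a made-up helper name, and the small arrays stand in for the output of tf.nn.top_k. Because top-k results are ordered by value, the column indices within a row are not sorted, which is presumably why the answer above passes the result through tf.sparse_reorder.

```python
import numpy as np

def top_k_to_coo(col_indices, values):
    """Turn [N, k] column indices and values into [N * k, 2] indices and flat values.

    Hypothetical helper for illustration; not part of TensorFlow or NumPy.
    """
    col_indices = np.asarray(col_indices)
    n_rows, k = col_indices.shape
    # Row i contributes k entries, so its row number is repeated k times.
    row_indices = np.repeat(np.arange(n_rows), k)
    coo = np.stack([row_indices, col_indices.reshape(-1)], axis=1)
    return coo, np.asarray(values).reshape(-1)

# Stand-in for the (values, indices) pair returned by a top-k call with k=2.
top_indices = np.array([[6, 3], [5, 0]])
top_values = np.array([[3.4, 2.3], [0.72, 0.71]])

indices, flat_values = top_k_to_coo(top_indices, top_values)
print(indices.tolist())      # [[0, 6], [0, 3], [1, 5], [1, 0]]
print(flat_values.tolist())  # [3.4, 2.3, 0.72, 0.71]
```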
I am trying to calculate the inverse matrix using the Gauss-Jordan Method. For that, I need to find the solution X to A.X = I (A and X being N x N matrices, and I the identity matrix).
However, for every column vector of the solution matrix X I calculate in the first loop, I have to use the original matrix A, but I don't know why it keeps changing when I did a copy of it in the beginning.
def SolveGaussJordanInvMatrix(A):
    N = len(A[:,0])
    I = np.identity(N)
    X = np.zeros([N,N], float)
    A_orig = A.copy()
    for m in range(N):
        x = np.zeros(N, float)
        v = I[:,m]
        A = A_orig
        for p in range(N): # Gauss-Jordan Elimination
            A[p,:] /= A[p,p]
            v[p] /= A[p,p]
            for i in range(p): # Cancel elements above the diagonal element
                v[i] -= v[p] * A[i,p]
                A[i,p:] -= A[p,p:]*A[i,p]
            for i in range(p+1, N): # Cancel elements below the diagonal element
                v[i] -= v[p] * A[i,p]
                A[i,p:] -= A[p,p:]*A[i,p]
        X[:,m] = v # Add column vector to the solution matrix
    return X
A = np.array([[2, 1, 4, 1 ],
[3, 4, -1, -1],
[1, -4, 7, 5],
[2, -2, 1, 3]], float)
SolveGaussJordanInvMatrix(A)
Does anyone know how to turn A back to its original form after the Gauss-Jordan elimination loop?
I'm getting
array([[ 228.1, 0. , 0. , 0. ],
[-219.9, 1. , 0. , 0. ],
[ -14.5, 0. , 1. , 0. ],
[-176.3, 0. , 0. , 1. ]])
and expect
[[ 1.36842105 -0.89473684 -1.05263158 1. ]
[-1.42105263 1.23684211 1.13157895 -1. ]
[ 0.42105263 -0.23684211 -0.13157895 -0. ]
[-2. 1.5 1.5 -1. ]]
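A possible explanation, offered as a sketch rather than a definitive diagnosis: `A = A_orig` only rebinds the name, so the in-place row operations also modify `A_orig`, and `v = I[:,m]` is a view into `I` for the same reason. In addition, `v[p] /= A[p,p]` runs after the row has already been normalised, so it divides by 1. The version below works on a private copy, stores the pivot before dividing, and reduces all columns of the identity at once. The function name is made up, and it assumes every pivot is non-zero (no row swapping).

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by Gauss-Jordan elimination without modifying the caller's array.

    Sketch only: assumes every pivot is non-zero, so no row swaps are done.
    """
    A = np.array(A, dtype=float)   # np.array copies, so the input stays intact
    n = A.shape[0]
    X = np.identity(n)             # becomes the inverse as A becomes the identity
    for p in range(n):
        pivot = A[p, p]            # read the pivot BEFORE normalising the row
        A[p, :] /= pivot
        X[p, :] /= pivot
        for i in range(n):
            if i != p:
                factor = A[i, p]
                A[i, :] -= factor * A[p, :]
                X[i, :] -= factor * X[p, :]
    return X

A = np.array([[2, 1, 4, 1],
              [3, 4, -1, -1],
              [1, -4, 7, 5],
              [2, -2, 1, 3]], float)
A_before = A.copy()
X = gauss_jordan_inverse(A)
print(np.allclose(X @ A, np.identity(4)))
print(np.array_equal(A, A_before))
```

If you prefer to keep the column-by-column structure, the smaller change would be `A = A_orig.copy()` and `v = I[:, m].copy()` inside the outer loop.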
I've been trying to create a watershed algorithm and as all the examples seem to be in Python I've run into a bit of a wall. I've been trying to find in numpy documentation what this line means:
matrixVariable[A==255] = 0
but have had no luck. Could anyone explain what that operation does?
For context the line in action: label [lbl == -1] = 0
The expression A == 255 creates a boolean array with the same shape as A, which is True wherever an element of A equals 255 and False otherwise.
The expression matrixVariable[A==255] = 0 then sets to 0 every element of matrixVariable whose position holds True in A == 255.
EG:
import numpy as np
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.zeros([3, 3])
print('before:')
print(B)
B[A>5] = 5
print('after:')
print(B)
OUT:
before:
[[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]]
after:
[[ 0. 0. 0.]
[ 0. 0. 5.]
[ 5. 5. 5.]]
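Applied to the watershed line from the question, the same mechanics might look like the sketch below. The arrays `lbl` and `label` are small made-up stand-ins, not real segmentation output. The last statement shows np.where, which gives the same result as a new array instead of assigning in place.

```python
import numpy as np

# Hypothetical stand-ins for the arrays in the question.
lbl = np.array([[1, -1, 2],
                [1, -1, -1]])
label = np.full(lbl.shape, 255)

mask = lbl == -1     # boolean array with the same shape as lbl
label[mask] = 0      # in-place assignment wherever mask is True

# Same result, but built as a new array rather than modified in place.
label_copy = np.where(lbl == -1, 0, 255)

print(mask)
print(label)
```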
I assume that matrixVariable and A are numpy arrays. If that assumption is correct, the expression "matrixVariable[A==255] = 0" first finds the indices where the values of A are equal to 255, then selects the entries of matrixVariable at those indices and sets them to 0.
Example:
import numpy as np
matrixVariable = np.array([(1, 3),
(2, 2),
(3,1)])
A = np.array([255, 1,255])
So A[0] and A[2] are equal to 255
matrixVariable[A==255]=0 #then sets matrixVariable[0] and matrixVariable[2] to zero
print(matrixVariable) # this would print
[[0 0]
[2 2]
[0 0]]
I have a 3-D tensor of shape [batch, None, dim] where the second dimension, i.e. the timesteps, is unknown. I use dynamic_rnn to process such input, like in the following snippet:
import numpy as np
import tensorflow as tf
batch = 2
dim = 3
hidden = 4
lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
cell = tf.nn.rnn_cell.GRUCell(hidden)
cell_state = cell.zero_state(batch, tf.float32)
output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
Running this snippet with some actual numbers gives reasonable results:
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
[[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output_ = sess.run(output, {inputs: inputs_, lengths: lengths_})
    print(output_)
And the output is:
[[[ 0. 0. 0. 0. ]
[ 0.02188676 -0.01294564 0.05340237 -0.47148666]
[ 0.0343586 -0.02243731 0.0870839 -0.89869428]
[ 0. 0. 0. 0. ]]
[[ 0.00284752 -0.00315077 0.00108094 -0.99883419]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]]
Is there a way to get a 3-D tensor of shape [batch, 1, hidden] with the last relevant output of the dynamic RNN? Thanks!
This is what gather_nd is for!
def extract_axis_1(data, ind):
    """
    Get specified elements along the first axis of tensor.
    :param data: Tensorflow tensor that will be subsetted.
    :param ind: Indices to take (one for each element along axis 0 of data).
    :return: Subsetted tensor.
    """
    batch_range = tf.range(tf.shape(data)[0])
    indices = tf.stack([batch_range, ind], axis=1)
    res = tf.gather_nd(data, indices)
    return res
In your case:
output = extract_axis_1(output, lengths - 1)
Now output is a tensor of dimension [batch_size, num_cells].
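To see what the (batch index, time index) pairs select, here is a NumPy analogue using integer array indexing. It is only an illustration of the indexing pattern, not TensorFlow code, and the array contents are made up.

```python
import numpy as np

batch, steps, hidden = 2, 4, 3
# Made-up stand-in for the RNN output, shape [batch, steps, hidden].
output = np.arange(batch * steps * hidden).reshape(batch, steps, hidden)
lengths = np.array([3, 1])

# Pair each batch index with its last valid time step, as gather_nd does
# with the stacked [batch_range, ind] indices.
last = output[np.arange(batch), lengths - 1]
print(last.tolist())   # [[6, 7, 8], [12, 13, 14]]

# Re-insert the time axis if a [batch, 1, hidden] result is wanted.
last_3d = last[:, np.newaxis, :]
print(last_3d.shape)   # (2, 1, 3)
```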
From the following two sources,
http://www.wildml.com/2016/08/rnns-in-tensorflow-a-practical-guide-and-undocumented-features/
outputs, last_states = tf.nn.dynamic_rnn(
cell=cell,
dtype=tf.float64,
sequence_length=X_lengths,
inputs=X)
Or https://github.com/ageron/handson-ml/blob/master/14_recurrent_neural_networks.ipynb,
It is clear that last_states can be taken directly from the SECOND output of the dynamic_rnn call. It gives you the last states across all layers (for an LSTM, each one is an LSTMStateTuple), while outputs contains the states of the last layer at every time step.
Okay, so it looks like there actually is an easier solution. As @Shao Tang and @Rahul mentioned, the preferred way to do this would be by accessing the final cell state. Here's why:
If you look at the GRUCell source code (below), you'll see that the "state" the cell maintains is the hidden state itself, i.e. the same tensor it returns as its output, not a set of weights. So, when tf.nn.dynamic_rnn returns the final state, it is returning the last relevant hidden output that you are interested in. To check this, I just tweaked your setup and got the results:
GRUCell Call (rnn_cell_impl.py):
def call(self, inputs, state):
    """Gated recurrent unit (GRU) with nunits cells."""
    if self._gate_linear is None:
        bias_ones = self._bias_initializer
        if self._bias_initializer is None:
            bias_ones = init_ops.constant_initializer(1.0, dtype=inputs.dtype)
        with vs.variable_scope("gates"):  # Reset gate and update gate.
            self._gate_linear = _Linear(
                [inputs, state],
                2 * self._num_units,
                True,
                bias_initializer=bias_ones,
                kernel_initializer=self._kernel_initializer)
    value = math_ops.sigmoid(self._gate_linear([inputs, state]))
    r, u = array_ops.split(value=value, num_or_size_splits=2, axis=1)
    r_state = r * state
    if self._candidate_linear is None:
        with vs.variable_scope("candidate"):
            self._candidate_linear = _Linear(
                [inputs, r_state],
                self._num_units,
                True,
                bias_initializer=self._bias_initializer,
                kernel_initializer=self._kernel_initializer)
    c = self._activation(self._candidate_linear([inputs, r_state]))
    new_h = u * state + (1 - u) * c
    return new_h, new_h
Solution:
import numpy as np
import tensorflow as tf
batch = 2
dim = 3
hidden = 4
lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
cell = tf.nn.rnn_cell.GRUCell(hidden)
cell_state = cell.zero_state(batch, tf.float32)
output, state = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
[[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output_, state_ = sess.run([output, state], {inputs: inputs_, lengths: lengths_})
    print(output_)
    print(state_)
Output:
[[[ 0. 0. 0. 0. ]
[-0.24305521 -0.15512943 0.06614969 0.16873555]
[-0.62767833 -0.30741733 0.14819752 0.44313088]
[ 0. 0. 0. 0. ]]
[[-0.99152333 -0.1006391 0.28767768 0.76360202]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]]
[[-0.62767833 -0.30741733 0.14819752 0.44313088]
[-0.99152333 -0.1006391 0.28767768 0.76360202]]
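Why the final state matches the last non-zero output row can be sketched in NumPy. The loop below imitates what dynamic_rnn does with sequence_length: past a sequence's end the state is copied through and the output is zeroed. Everything here is illustrative: the weights are random, biases are left out, and `gru_step` is a made-up helper that only mirrors the update in the `call` method above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, w_gates, w_cand):
    # Same update rule as GRUCell.call above, with biases omitted for brevity.
    gates = sigmoid(np.concatenate([x, h], axis=1) @ w_gates)
    r, u = np.split(gates, 2, axis=1)
    c = np.tanh(np.concatenate([x, r * h], axis=1) @ w_cand)
    return u * h + (1 - u) * c

rng = np.random.default_rng(0)
batch, steps, dim, hidden = 2, 4, 3, 4
w_gates = rng.normal(size=(dim + hidden, 2 * hidden))  # made-up weights
w_cand = rng.normal(size=(dim + hidden, hidden))       # made-up weights
inputs = rng.normal(size=(batch, steps, dim))
lengths = np.array([3, 1])

h = np.zeros((batch, hidden))
outputs = np.zeros((batch, steps, hidden))
for t in range(steps):
    new_h = gru_step(inputs[:, t], h, w_gates, w_cand)
    active = (t < lengths)[:, None]               # True while a sequence is running
    h = np.where(active, new_h, h)                # finished sequences keep their state
    outputs[:, t] = np.where(active, new_h, 0.0)  # and emit zeros

last = outputs[np.arange(batch), lengths - 1]
print(np.allclose(h, last))
```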
For other readers who are working with the LSTMCell (another popular option), things work a little differently. The LSTMCell maintains its state in a different way: the state is either a tuple or a concatenation of the actual cell state and the hidden state. So, to access the final hidden state, you could set state_is_tuple to True when creating the cell, and the final state will be a tuple: (final cell state, final hidden state). So, in this case,
_, (_, h) = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
will give you the final hidden state.
References:
c_state and m_state in Tensorflow LSTM
https://github.com/tensorflow/tensorflow/blob/438604fc885208ee05f9eef2d0f2c630e1360a83/tensorflow/python/ops/rnn_cell_impl.py#L308
https://github.com/tensorflow/tensorflow/blob/438604fc885208ee05f9eef2d0f2c630e1360a83/tensorflow/python/ops/rnn_cell_impl.py#L415
Actually, the solution was not that hard. I implemented the following code:
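slices = []
for index, l in enumerate(tf.unstack(lengths)):
    # Take the output at the last valid time step of sequence `index`.
    slice = tf.slice(output, begin=[index, l - 1, 0], size=[1, 1, hidden])
    slices.append(slice)
last = tf.concat(slices, 0)  # TF >= 1.0 argument order: (values, axis)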
So, the full snippet would be the following:
import numpy as np
import tensorflow as tf
batch = 2
dim = 3
hidden = 4
lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
cell = tf.nn.rnn_cell.GRUCell(hidden)
cell_state = cell.zero_state(batch, tf.float32)
output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
[[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
slices = []
for index, l in enumerate(tf.unstack(lengths)):
    # Take the output at the last valid time step of sequence `index`.
    slice = tf.slice(output, begin=[index, l - 1, 0], size=[1, 1, hidden])
    slices.append(slice)
last = tf.concat(slices, 0)  # TF >= 1.0 argument order: (values, axis)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outputs = sess.run([output, last], {inputs: inputs_, lengths: lengths_})
    print('RNN output:')
    print(outputs[0])
    print()
    print('last relevant output:')
    print(outputs[1])
And the output:
RNN output:
[[[ 0. 0. 0. 0. ]
[-0.06667092 -0.09284072 0.01098599 -0.03676109]
[-0.09101103 -0.19828682 0.03546784 -0.08721405]
[ 0. 0. 0. 0. ]]
[[-0.00025157 -0.05704876 0.05527233 -0.03741353]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]]
last relevant output:
[[[-0.09101103 -0.19828682 0.03546784 -0.08721405]]
[[-0.00025157 -0.05704876 0.05527233 -0.03741353]]]
I have several sparse vectors represented as lists of tuples eg.
[[(22357, 0.6265631775164965),
(31265, 0.3900572375543419),
(44744, 0.4075397480094991),
(47751, 0.5377595092643747)],
[(22354, 0.6265631775164965),
(31261, 0.3900572375543419),
(42344, 0.4075397480094991),
(47751, 0.5377595092643747)],
...
]
And my goal is to compose scipy.sparse.csr_matrix from several millions of vectors like this.
I would like to ask whether there is a simple, elegant solution for this kind of conversion that does not require loading everything into memory at once.
EDIT:
Just a clarification: My goal is to build the 2d matrix, where each of my sparse vectors represent one row in matrix.
Collecting indices,data into a structured array avoids the integer-double conversion issue. It is also a bit faster than the vstack approach (in limited testing) (With list data like this np.array is faster than np.vstack.)
indptr = np.cumsum([0]+[len(i) for i in vectors])
aa = np.array(vectors,dtype='i,f').flatten()
A = sparse.csr_matrix((aa['f1'], aa['f0'], indptr))
I substituted the list comprehension for map since I'm using Python3.
Indices in the coo format (data, (i,j)) might be more intuitive
ii = [[i]*len(v) for i,v in enumerate(vectors)]
ii = np.array(ii).flatten()
aa = np.array(vectors,dtype='i,f').flatten()
A2 = sparse.coo_matrix((aa['f1'],(np.array(ii), aa['f0'])))
# A2.tocsr()
Here, ii from the 1st step is the row numbers for each sublist.
[[0, 0, 0, 0],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3],
...]
This construction method is slower than the csr direct indptr.
For a case where there are differing numbers of entries per row, this approach works (using itertools.chain to flatten the lists, so it needs from itertools import chain):
A sample list (no empty rows for now):
In [779]: vectors=[[(1, .12),(3, .234),(6,1.23)],
[(2,.222)],
[(2,.23),(1,.34)]]
row indexes:
In [780]: ii=[[i]*len(v) for i,v in enumerate(vectors)]
In [781]: ii=list(chain(*ii))
column and data values pulled from tuples and flattened
In [782]: jj=[j for j,_ in chain(*vectors)]
In [783]: data=[d for _,d in chain(*vectors)]
In [784]: ii
Out[784]: [0, 0, 0, 1, 2, 2]
In [785]: jj
Out[785]: [1, 3, 6, 2, 2, 1]
In [786]: data
Out[786]: [0.12, 0.234, 1.23, 0.222, 0.23, 0.34]
In [787]: A=sparse.csr_matrix((data,(ii,jj))) # coo style input
In [788]: A.A
Out[788]:
array([[ 0. , 0.12 , 0. , 0.234, 0. , 0. , 1.23 ],
[ 0. , 0. , 0.222, 0. , 0. , 0. , 0. ],
[ 0. , 0.34 , 0.23 , 0. , 0. , 0. , 0. ]])
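On the memory concern in the question, one option is to fill the three CSR arrays incrementally while iterating over the vectors, so the nested list of tuples never has to exist all at once. The sketch below uses the standard-library array module for compact storage. `csr_arrays` is a made-up helper name, and handing the result to scipy.sparse.csr_matrix is only indicated in a comment.

```python
from array import array

def csr_arrays(vectors):
    """Build (data, indices, indptr) from an iterable of [(col, value), ...] rows.

    Hypothetical helper; `vectors` may be a generator, so rows can be
    streamed from disk instead of being held in one big list.
    """
    indptr = array('q', [0])   # 64-bit ints: where each row starts in `indices`
    indices = array('q')       # column index of every stored value
    data = array('d')          # the stored values as doubles
    for row in vectors:
        for col, value in row:
            indices.append(col)
            data.append(value)
        indptr.append(len(indices))   # an empty row just repeats the last offset
    return data, indices, indptr

vectors = [[(1, .12), (3, .234), (6, 1.23)],
           [(2, .222)],
           [(2, .23), (1, .34)]]
data, indices, indptr = csr_arrays(iter(vectors))
print(list(indptr))    # [0, 3, 4, 6]
print(list(indices))   # [1, 3, 6, 2, 2, 1]

# With SciPy available, the matrix would then be built as:
#   A = scipy.sparse.csr_matrix((data, indices, indptr))
```

Unlike the structured-array approach, the column indices stay integers and the values stay doubles throughout.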
Consider the following:
import numpy as np
from scipy.sparse import csr_matrix
vectors = [[(22357, 0.6265631775164965),
(31265, 0.3900572375543419),
(44744, 0.4075397480094991),
(47751, 0.5377595092643747)],
[(22354, 0.6265631775164965),
(31261, 0.3900572375543419),
(42344, 0.4075397480094991),
(47751, 0.5377595092643747)]]
indptr = np.cumsum([0] + list(map(len, vectors)))  # list() keeps this working on Python 3
indices, data = np.vstack(vectors).T
A = csr_matrix((data, indices.astype(int), indptr))
Unfortunately, this way the column indices are converted from integers to doubles and back. This works correctly for up to very large matrices, but is not ideal.
I searched the net for a guide to SciPy sparse matrices and found nothing. I would be happy if anybody could share a source for one, but on to the question:
I have an array of tuples. I want to turn it into a sparse matrix where the tuples appear on the main diagonal and the diagonal just above it, as the following example shows. What is the fancy (efficient) way of doing it?
import numpy as np
A=np.asarray([[1,2],[3,4],[5,6],[7,8]])
B=np.zeros((A.shape[0],A.shape[0]+1))
for i in range(A.shape[0]):
    B[i,i]=A[i,0]
    B[i,i+1]=A[i,1]
print(B)
Output being:
[[ 1. 2. 0. 0. 0.]
[ 0. 3. 4. 0. 0.]
[ 0. 0. 5. 6. 0.]
[ 0. 0. 0. 7. 8.]]
You can build those really fast as a CSR matrix:
>>> A = np.asarray([[1,2],[3,4],[5,6],[7,8]])
>>> rows = len(A)
>>> cols = rows + 1
>>> data = A.flatten() # we want a copy
>>> indptr = np.arange(0, len(data)+1, 2) # 2 non-zero entries per row
>>> indices = np.repeat(np.arange(cols), [1] + [2] * (cols-2) + [1])
>>> import scipy.sparse as sps
>>> a_sps = sps.csr_matrix((data, indices, indptr), shape=(rows, cols))
>>> a_sps.A
array([[1, 2, 0, 0, 0],
[0, 3, 4, 0, 0],
[0, 0, 5, 6, 0],
[0, 0, 0, 7, 8]])
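To see why these three arrays describe the matrix, the sketch below rebuilds the dense result by hand with NumPy only: row r owns the entries from indptr[r] up to indptr[r+1], and indices gives the column of each one. It is meant as an illustration of the CSR layout, not as a replacement for the scipy call.

```python
import numpy as np

A = np.asarray([[1, 2], [3, 4], [5, 6], [7, 8]])
rows = len(A)
cols = rows + 1
data = A.flatten()
indptr = np.arange(0, len(data) + 1, 2)   # 2 non-zero entries per row
indices = np.repeat(np.arange(cols), [1] + [2] * (cols - 2) + [1])
print(indices.tolist())   # [0, 1, 1, 2, 2, 3, 3, 4]

# Manual densification: scatter each row's slice of `data` into its columns.
dense = np.zeros((rows, cols), dtype=data.dtype)
for r in range(rows):
    start, stop = indptr[r], indptr[r + 1]
    dense[r, indices[start:stop]] = data[start:stop]
print(dense)
```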
Try diags from scipy
import numpy as np
import scipy.sparse
A = np.asarray([[1,2],[3,4],[5,6],[7,8]])
B = scipy.sparse.diags([A[:,0], A[:,1]], [0, 1], [4, 5])
When I print B.todense(), it gives me
[[ 1. 2. 0. 0. 0.]
[ 0. 3. 4. 0. 0.]
[ 0. 0. 5. 6. 0.]
[ 0. 0. 0. 7. 8.]]