Theano: Operate on nonzero elements of sparse matrix - python

I'm trying to take the exp of nonzero elements in a sparse theano variable. I have the current code:
A = T.matrix("Some matrix with many zeros")
A_sparse = theano.sparse.csc_from_dense(A)
I'm trying to do something that's equivalent to the following numpy syntax:
mask = (A_sparse != 0)
A_sparse[mask] = np.exp(A_sparse[mask])
but Theano doesn't support != masks yet. (And (A_sparse > 0) | (A_sparse < 0) doesn't seem to work either.)
How can I achieve this?

The support for sparse matrices in Theano is incomplete, so some things are tricky to achieve. You can use theano.sparse.structured_exp(A_sparse) in that particular case, but I try to answer your question more generally below.
Comparison
In Theano one would normally use the comparison operators described here: http://deeplearning.net/software/theano/library/tensor/basic.html
For example, instead of A != 0, one would write T.neq(A, 0). With sparse matrices one has to use the comparison operators in theano.sparse. Both operands have to be sparse matrices, and the result is also a sparse matrix:
mask = theano.sparse.neq(A_sparse, theano.sparse.sp_zeros_like(A_sparse))
Modifying a Subtensor
In order to modify part of a matrix, one can use theano.tensor.set_subtensor. With dense matrices this would work:
indices = mask.nonzero()
A = T.set_subtensor(A[indices], T.exp(A[indices]))
Notice that Theano doesn't have a separate boolean type (the mask is zeros and ones), so nonzero() has to be called first to get the indices of the nonzero elements. Furthermore, this is not implemented for sparse matrices.
Operating on Nonzero Sparse Elements
Theano provides sparse operations that are said to be structured and operate only on the nonzero elements. See:
http://deeplearning.net/software/theano/tutorial/sparse.html#structured-operation
More precisely, they operate on the data attribute of a sparse matrix, independent of the indices of the elements. Such operations are straightforward to implement. Note that the structured operations will operate on all the values in the data array, including those that are explicitly set to zero.
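For example, here is a minimal sketch of the structured_exp route mentioned above (assuming a standard Theano setup; the input values are purely illustrative):
import numpy as np
import theano
import theano.sparse
import theano.tensor as T

# exp() is applied only to the stored (nonzero) elements of the sparse matrix.
A = T.matrix("A")
A_sparse = theano.sparse.csc_from_dense(A)
A_exp = theano.sparse.structured_exp(A_sparse)

f = theano.function([A], theano.sparse.dense_from_sparse(A_exp))
x = np.array([[0.0, 1.0], [2.0, 0.0]], dtype=theano.config.floatX)
print(f(x))  # zeros stay zero; 1.0 -> e, 2.0 -> e**2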

Here's a way of doing this with the scipy.sparse module. I don't know how Theano implements its sparse matrices, but they are likely based on similar ideas (since it uses names like csc).
In [224]: A=sparse.csc_matrix([[1.,0,0,2,0],[0,0,3,0,0],[0,1,1,2,0]])
In [225]: A.A
Out[225]:
array([[ 1.,  0.,  0.,  2.,  0.],
       [ 0.,  0.,  3.,  0.,  0.],
       [ 0.,  1.,  1.,  2.,  0.]])
In [226]: A.data
Out[226]: array([ 1., 1., 3., 1., 2., 2.])
In [227]: A.data[:]=np.exp(A.data)
In [228]: A.A
Out[228]:
array([[  2.71828183,   0.        ,   0.        ,   7.3890561 ,   0.        ],
       [  0.        ,   0.        ,  20.08553692,   0.        ,   0.        ],
       [  0.        ,   2.71828183,   2.71828183,   7.3890561 ,   0.        ]])
The main attributes of the csc format are data, indices, and indptr. It's possible that data has some 0 values if you fiddle with them after creation, but a freshly created matrix shouldn't.
The matrix also has a nonzero method modeled on the numpy one. In practice it converts the matrix to coo format, filters out any zero values, and returns the row and col attributes:
In [229]: A.nonzero()
Out[229]: (array([0, 0, 1, 2, 2, 2]), array([0, 3, 2, 1, 2, 3]))
And the csc format allows indexing just like a dense numpy array:
In [230]: A[A.nonzero()]
Out[230]:
matrix([[  2.71828183,   7.3890561 ,  20.08553692,   2.71828183,
            2.71828183,   7.3890561 ]])

T.where works.
A_sparse = T.where(A_sparse == 0, 0, T.exp(A_sparse))
@Seppo Envari's answer seems faster though, so I'll accept his answer.

Related

Vectorize list returning python function into numpy nd-array [duplicate]

numpy.vectorize takes a function f:a->b and turns it into g:a[]->b[].
This works fine when a and b are scalars, but I can't think of a reason why it wouldn't work with b as an ndarray or list, i.e. f:a->b[] and g:a[]->b[][]
For example:
import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
g = np.vectorize(f, otypes=[np.ndarray])
a = np.arange(4)
print(g(a))
This yields:
array([[ 0. 0. 0. 0. 0.],
       [ 1. 1. 1. 1. 1.],
       [ 2. 2. 2. 2. 2.],
       [ 3. 3. 3. 3. 3.]], dtype=object)
Ok, so that gives the right values, but the wrong dtype. And even worse:
g(a).shape
yields:
(4,)
So this array is pretty much useless. I know I can convert it doing:
np.array(map(list, g(a)), dtype=np.float32)
to give me what I want:
array([[ 0., 0., 0., 0., 0.],
       [ 1., 1., 1., 1., 1.],
       [ 2., 2., 2., 2., 2.],
       [ 3., 3., 3., 3., 3.]], dtype=float32)
but that is neither efficient nor pythonic. Can any of you guys find a cleaner way to do this?
np.vectorize is just a convenience function. It doesn't actually make code run any faster. If it isn't convenient to use np.vectorize, simply write your own function that works as you wish.
The purpose of np.vectorize is to transform functions which are not numpy-aware (e.g. take floats as input and return floats as output) into functions that can operate on (and return) numpy arrays.
Your function f is already numpy-aware -- it uses a numpy array in its definition and returns a numpy array. So np.vectorize is not a good fit for your use case.
The solution therefore is just to roll your own function f that works the way you desire.
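For instance, one possible hand-rolled version (a sketch; the name f_vec is mine) broadcasts x against the constant vector instead of looping element by element:
import numpy as np

def f_vec(x):
    # Append a new axis to x so it broadcasts against the length-5 vector.
    x = np.asarray(x, dtype=np.float32)
    return x[..., np.newaxis] * np.ones(5, dtype=np.float32)

print(f_vec(np.arange(4)).shape)  # (4, 5)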
A new parameter, signature, added in NumPy 1.12.0, does exactly what you want.
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
g = np.vectorize(f, signature='()->(n)')
Then g(np.arange(4)).shape will give (4L, 5L).
Here the signature of f is specified: (n) is the shape of the return value, and () is the shape of the parameter, which is a scalar. The parameters can be arrays too. For more complex signatures, see the Generalized Universal Function API.
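As a further, hypothetical illustration of the signature syntax (not part of the question), a parameter can itself be declared as an array; '(m)->()' means each call receives a 1-D block and returns a scalar:
import numpy as np

# Each row of the (3, 4) input is passed as a 1-D array and reduced to a scalar.
row_sum = np.vectorize(lambda v: v.sum(), signature='(m)->()')
print(row_sum(np.arange(12).reshape(3, 4)))  # [ 6 22 38], shape (3,)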
import numpy as np

def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)

g = np.vectorize(f, otypes=[np.ndarray])

a = np.arange(4)
b = g(a)
b = np.array(b.tolist())
print(b)  # b.shape = (4, 5)

c = np.ones((2,3,4))
d = g(c)
d = np.array(d.tolist())
print(d)  # d.shape = (2, 3, 4, 5)
This should fix the problem, and it will work regardless of what size your input is. map only works for one-dimensional inputs. Using .tolist() and creating a new ndarray solves the problem more completely and nicely (I believe). Hope this helps.
You want to vectorize the function
import numpy as np
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
Assuming that you want to get single np.float32 arrays as a result, you have to specify this as the otype. In your question, however, you specified otypes=[np.ndarray], which means you want every element to be an np.ndarray. Thus, you correctly get a result of dtype=object.
The correct call would be
np.vectorize(f, signature='()->(n)', otypes=[np.float32])
For such a simple function it is, however, better to leverage numpy's ufuncs; np.vectorize just loops over the input. So in your case just rewrite your function as
def f(x):
    return np.multiply.outer(x, np.array([1,1,1,1,1], dtype=np.float32))
This is faster and produces less obscure errors (note, however, that the result's dtype will depend on x: if you pass a complex or quad-precision number, the result will have that precision as well).
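A quick sketch of how the shapes compose with this outer-product version (the output shape is always x.shape plus the trailing (5,)):
import numpy as np

def f(x):
    return np.multiply.outer(x, np.array([1, 1, 1, 1, 1], dtype=np.float32))

print(f(np.arange(4)).shape)     # (4, 5)
print(f(np.ones((2, 3))).shape)  # (2, 3, 5) -- works for any input shape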
I've written a function that seems to fit your need.
def amap(func, *args):
    '''array version of built-in map
    amap(function, sequence[, sequence, ...]) -> array

    Examples
    --------
    >>> amap(lambda x: x**2, 1)
    array(1)
    >>> amap(lambda x: x**2, [1, 2])
    array([1, 4])
    >>> amap(lambda x,y: y**2 + x**2, 1, [1, 2])
    array([2, 5])
    >>> amap(lambda x: (x, x), 1)
    array([1, 1])
    >>> amap(lambda x,y: [x**2, y**2], [1,2], [3,4])
    array([[1, 9], [4, 16]])
    '''
    args = np.broadcast(None, *args)
    res = np.array([func(*arg[1:]) for arg in args])
    shape = args.shape + res.shape[1:]
    return res.reshape(shape)
Let's try
def f(x):
    return x * np.array([1,1,1,1,1], dtype=np.float32)
amap(f, np.arange(4))
Outputs
array([[ 0., 0., 0., 0., 0.],
       [ 1., 1., 1., 1., 1.],
       [ 2., 2., 2., 2., 2.],
       [ 3., 3., 3., 3., 3.]], dtype=float32)
You may also wrap it with lambda or partial for convenience
g = lambda x:amap(f, x)
g(np.arange(4))
Note the docstring of vectorize says
The vectorize function is provided primarily for convenience, not for
performance. The implementation is essentially a for loop.
Thus we would expect amap here to have performance similar to vectorize. I didn't check it; any performance tests are welcome.
If performance is really important, you should consider something else, e.g. direct array calculation with reshape and broadcasting to avoid looping in pure Python (both vectorize and amap are the latter case).
The best way to solve this would be to use a 2-D NumPy array (in this case a column array) as an input to the original function, which will then generate a 2-D output with the results I believe you were expecting.
Here is what it might look like in code:
import numpy as np
def f(x):
    return x*np.array([1, 1, 1, 1, 1], dtype=np.float32)
a = np.arange(4).reshape((4, 1))
b = f(a)
# b is a 2-D array with shape (4, 5)
print(b)
This is a much simpler and less error-prone way to complete the operation. Rather than trying to transform the function with numpy.vectorize, this method relies on NumPy's natural ability to broadcast arrays. The trick is to make sure that at least one dimension has an equal length between the arrays.

subsetting affects .view(np.float64) behaviour

I'm trying to use some sklearn estimators for classification on the coefficients of a fast Fourier transform (technically a Discrete Fourier Transform). I obtain a numpy array X_c as output of np.fft.fft(X) and I want to transform it into a real numpy array X_r, with each (complex) column of the original X_c transformed into two (real/float) columns in X_r, i.e. the shape goes from (r, c) to (r, 2c). So I use .view(np.float64), and it works at first.
The problem is that if I first decide to keep only some coefficients of the original complex array with X_c2 = X_c[:, range(3)] and then do the same thing as before, instead of the number of columns being doubled, the number of rows is doubled (the imaginary part of each element is put in a new row below the original).
I really don't understand why this happens.
To make myself clearer, here is a toy example:
import numpy as np
# I create a complex array
X_c = np.arange(8, dtype = np.complex128).reshape(2, 4)
print(X_c.shape) # -> (2, 4)
# I use .view to transform it into something real and it works
# the way I want it.
X_r = X_c.view(np.float64)
print(X_r.shape) # -> (2, 8)
# Now I subset the array.
indices_coef = range(3)
X_c2 = X_c[:, indices_coef]
print(X_c2.shape) # -> (2, 3)
X_r2 = X_c2.view(np.float64)
# In the next line I obtain (4, 3), when I was expecting (2, 6)...
print(X_r2.shape) # -> (4, 3)
Does anyone see a reason for this difference of behavior?
I get a warning:
In [5]: X_c2 = X_c[:,range(3)]
In [6]: X_c2
Out[6]:
array([[ 0.+0.j, 1.+0.j, 2.+0.j],
       [ 4.+0.j, 5.+0.j, 6.+0.j]])
In [7]: X_c2.view(np.float64)
/usr/local/bin/ipython3:1: DeprecationWarning: Changing the shape of non-C contiguous array by
descriptor assignment is deprecated. To maintain
the Fortran contiguity of a multidimensional Fortran
array, use 'a.T.view(...).T' instead
#!/usr/bin/python3
Out[7]:
array([[ 0., 1., 2.],
       [ 0., 0., 0.],
       [ 4., 5., 6.],
       [ 0., 0., 0.]])
In [12]: X_c2.strides
Out[12]: (16, 32)
In [13]: X_c2.flags
Out[13]:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
So this copy (or is it a view?) is in Fortran order. The recommended X_c2.T.view(float).T produces the same 4x3 array without the warning.
As your first view shows, a complex array has the same data layout as twice the number of floats.
I've seen funny shape behavior when trying to view a structured array. I'm wondering whether the complex dtype is behaving much like a dtype('f8,f8') array.
If I change your X_c2 so it is a copy, I get the expected behavior
In [19]: X_c3 = X_c[:,range(3)].copy()
In [20]: X_c3.flags
Out[20]:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
In [21]: X_c3.strides
Out[21]: (48, 16)
In [22]: X_c3.view(float)
Out[22]:
array([[ 0., 0., 1., 0., 2., 0.],
       [ 4., 0., 5., 0., 6., 0.]])
That's reassuring. But I'm puzzled as to why the [:, range(3)] indexing creates an F-order view. That should be advanced indexing.
And indeed, a true slice does not allow this view
In [28]: X_c[:,:3].view(np.float64)
---------------------------------------------------------------------------
ValueError: new type not compatible with array.
So the range indexing has created some sort of hybrid object.
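One possible workaround (a sketch, not necessarily the only fix): force a C-contiguous copy before the view, so the complex-to-float reinterpretation doubles the columns as expected:
import numpy as np

X_c = np.arange(8, dtype=np.complex128).reshape(2, 4)
X_c2 = np.ascontiguousarray(X_c[:, range(3)])  # C-ordered copy of the subset
X_r2 = X_c2.view(np.float64)
print(X_r2.shape)  # (2, 6), as expected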

Numpy: signed values of element-wise absolute maximum of a 2D array

Let us assume that I have a 2D array named arr of shape (4, 3) as follows:
>>> arr
array([[ nan,   1., -18.],
       [ -1.,  -1.,  -1.],
       [  1.,   1.,   5.],
       [  1.,  -1.,   0.]])
Say that, I would like to assign the signed value of the element-wise absolute maximum of (1.0, 1.0, -15.0) and the rows arr[[0, 2], :] back to arr. Which means, I am looking for the output:
>>> arr
array([[  1.,   1., -18.],
       [ -1.,  -1.,  -1.],
       [  1.,   1., -15.],
       [  1.,  -1.,   0.]])
The closest thing I found in the API reference for this is numpy.fmax but it doesn't do the absolute value. If I used:
arr[index_list, :] = np.fmax(arr[index_list, :], new_tuple)
my array would finally look like:
>>> arr
array([[  1.,   1., -15.],
       [ -1.,  -1.,  -1.],
       [  1.,   1.,   5.],
       [  1.,  -1.,   0.]])
Now, the API says that this function is
equivalent to np.where(x1 >= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting
I tried using the following:
arr[index_list, :] = np.where(np.absolute(arr[index_list, :]) >= np.absolute(new_tuple),
                              arr[index_list, :], new_tuple)
Although this produced the desired output, I got the warning:
/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevconsole.py:1: RuntimeWarning: invalid value encountered in greater_equal
I believe this warning is because of the NaN which is not handled gracefully here, unlike the np.fmax function. In addition, the API docs mention that np.fmax is faster and does broadcasting correctly (not sure what part of broadcasting is missing in the np.where version)
In conclusion, what I am looking for is something similar to:
arr[index_list, :] = np.fmax(arr[index_list, :], new_tuple, key=abs)
There is no such key attribute available to this function, unfortunately.
Just for context, I am interested in the fastest possible solution because the actual shape of my arr array is on average (100000, 50) and I am looping through almost 1000 new_tuple tuples (with each tuple equal in length to the number of columns in arr, of course). The index_list changes for each new_tuple.
Edit 1:
One possible solution is to begin by replacing all NaN in arr with 0, i.e. arr[np.isnan(arr)] = 0. After this, I can use the np.where with np.absolute trick mentioned in my original text. However, this is probably a lot slower than np.fmax, as suggested by the API.
Edit 2:
The index_list may have repeated indexes in subsequent loops. Every new_tuple comes with a corresponding rule, and the index_list is selected based on that rule. There is nothing stopping different rules from having overlapping indexes that they match to. @Divakar has an excellent answer for the case where index_list has no repeats. Other solutions are however welcome, covering both cases.
Assuming that the combined list of all index_lists has no repeated indexes:
Approach #1
I would propose more of a vectorized solution once we have all of index_lists and new_tuples stored in one place, preferably as a list. As such this could be the preferred one, if we are dealing with lots of such tuples and lists.
So, let's say we have them stored as the following :
new_tuples = [(1.0, 1.0, -15.0), (6.0, 3.0, -4.0)] # list of all new_tuple
index_lists =[[0,2],[4,1,6]] # list of all index_list
The solution thereafter would be to manually repeat (replacing the broadcasting) and then use np.where as shown in the question. Regarding the concern about the warning from np.where: we can ignore it if the new_tuples have non-NaN values. Thus, the solution would be -
idx = np.concatenate(index_lists)
lens = list(map(len,index_lists))
a = arr[idx]
b = np.repeat(new_tuples,lens,axis=0)
arr[idx] = np.where(np.abs(a) > np.abs(b), a, b)
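For reference, here is a small end-to-end sketch of Approach #1 applied to the toy data from the question (the NaN comparison evaluates to False, so the new value wins in that slot, though NumPy may still emit the invalid-value RuntimeWarning):
import numpy as np

arr = np.array([[np.nan, 1., -18.],
                [-1., -1., -1.],
                [1., 1., 5.],
                [1., -1., 0.]])
new_tuples = [(1.0, 1.0, -15.0)]   # list of all new_tuple
index_lists = [[0, 2]]             # list of all index_list

idx = np.concatenate(index_lists)
lens = list(map(len, index_lists))
a = arr[idx]
b = np.repeat(new_tuples, lens, axis=0)
arr[idx] = np.where(np.abs(a) > np.abs(b), a, b)
print(arr)  # rows 0 and 2 now hold the signed element-wise absolute maxima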
Approach #2
Another approach would be to store the absolute values of arr beforehand: abs_arr = np.abs(arr), and use those within np.where. This should save a lot of time within the loop. Thus, the relevant computation would reduce to:
arr[index_list, :] = np.where(abs_arr[index_list, :] > np.abs(new_tuple), arr[index_list, :], new_tuple)

Efficient incremental sparse matrix in python / scipy / numpy

Is there a way in Python to have an efficient incremental update of a sparse matrix?
H = lil_matrix((n,m))
for (i,j) in zip(A,B):
    H[i,j] += compute_something
It seems that such a way to build a sparse matrix is quite slow (lil_matrix is the fastest sparse matrix type for that).
Is there a way (like using dict of dict or other kind of approaches) to efficiently build the sparse matrix H?
In https://stackoverflow.com/a/27771335/901925 I explore incremental matrix assignment.
lil and dok are the recommended formats if you want to change values. csr will give you an efficiency warning, and coo does not allow indexing.
But I also found that dok indexing is slow compared to regular dictionary indexing. So for many changes it is better to build a plain dictionary (with the same tuple indexing), and build the dok matrix from that.
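A sketch of that dictionary route (the sample values are purely illustrative, and the construction step here goes through coo_matrix rather than dok): accumulate in a plain dict keyed by (row, col), then build the sparse matrix in one shot.
import numpy as np
from scipy import sparse

d = {}
for i, j, v in [(0, 0, 1.0), (1, 2, 4.0), (1, 2, 5.0), (2, 1, 6.0)]:
    d[i, j] = d.get((i, j), 0.0) + v  # incremental updates in a plain dict

rows, cols = zip(*d.keys())
H = sparse.coo_matrix((list(d.values()), (rows, cols)), shape=(4, 4)).tocsr()
print(H.toarray())  # entry (1, 2) holds 4.0 + 5.0 = 9.0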
But if you can calculate the H data values with a fast numpy vector operation, as opposed to iteration, it is best to do so, and construct the sparse matrix from that (e.g. coo format). In fact even with iteration this would be faster:
h = np.zeros(A.shape)
for k, (i,j) in enumerate(zip(A,B)):
    h[k] = compute_something
H = sparse.coo_matrix((h, (A, B)), shape=(n,m))
e.g.
In [780]: A=np.array([0,1,1,2]); B=np.array([0,2,2,1])
In [781]: h=np.zeros(A.shape)
In [782]: for k, (i,j) in enumerate(zip(A,B)):
   .....:     h[k] = i+j+k
   .....:
In [783]: h
Out[783]: array([ 0., 4., 5., 6.])
In [784]: M=sparse.coo_matrix((h,(A,B)),shape=(4,4))
In [785]: M
Out[785]:
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in COOrdinate format>
In [786]: M.A
Out[786]:
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  9.,  0.],
       [ 0.,  6.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])
Note that the (1,2) value is the sum 4+5. That's part of the coo to csr conversion.
In this case I could have calculated h with:
In [791]: A+B+np.arange(A.shape[0])
Out[791]: array([0, 4, 5, 6])
so there's no need for iteration.
Nope, do not use csr_matrix or csc_matrix, as they are going to be even slower than lil_matrix if you construct them incrementally. The Dictionary Of Keys based sparse matrix (dok_matrix) is exactly what you are looking for:
import numpy as np
from scipy.sparse import dok_matrix

S = dok_matrix((5, 5), dtype=np.float32)
for i in range(5):
    for j in range(5):
        S[i,j] = i+j  # Update elements
A faster way would be:
H_ij = compute_something_vectorized()
H = coo_matrix((H_ij, (A, B))).tocsr()
The data for duplicate coordinates are then summed; see the docs for coo_matrix.

Create a numpy array according to another array along with indices array

I have a numpy array (e.g., a = np.array([ 8., 2.])), and another array which stores the indices I would like to get from the former array (e.g., b = np.array([ 0., 1., 1., 0., 0.])).
What I would like to do is to create another array from these 2 arrays, in this case, it should be: array([ 8., 2., 2., 8., 8.])
of course, I can always use a for loop to achieve this goal:
for i in range(5):
    c[i] = a[b[i]]
I wonder if there is a more elegant method to create this array. Something like c = a[b[0:5]] (well, this apparently doesn't work)
Only integer arrays can be used for indexing, and you've created b as a float64 array. You can get what you're looking for if you explicitly convert to integer:
bi = np.array(b, dtype=int)
c = a[bi[0:5]]
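A self-contained version using the arrays from the question (b.astype(int) is an equivalent way to do the conversion):
import numpy as np

a = np.array([8., 2.])
b = np.array([0., 1., 1., 0., 0.])

c = a[b.astype(int)]
print(c)  # [ 8.  2.  2.  8.  8.]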
