This is an implementation of logistic regression, using a toy data set. Some feedback from @dermen helped me fix a basic problem with how I was using scipy.optimize.minimize, but even after fixing that issue, optimize fails to converge, even using just the first five rows of the test data set. Here is a stand-alone version of the code:
import numpy as np
from scipy.optimize import minimize
# `data` is a subset of a toy dataset. The full dataset is ~100 rows, linearly separable, and located at
# https://github.com/liavkoren/ng-redux/blob/master/ex2/ex2data1.txt
data = np.array([
[ 34.62365962, 78.02469282],
[ 30.28671077, 43.89499752],
[ 35.84740877, 72.90219803],
[ 60.18259939, 86.3085521 ],
[ 79.03273605, 75.34437644],
[ 45.08327748, 56.31637178],
[ 61.10666454, 96.51142588],
[ 75.02474557, 46.55401354],
[ 76.0987867, 87.42056972],
[ 84.43281996, 43.53339331],
])
# Ground truth
y = np.array([0., 0., 0., 1., 1., 0., 1., 1., 1., 1.])
def sigmoid(z):
    return 1/(1 + np.power(np.e, -z))
h = lambda theta, x: sigmoid(x.dot(theta))
def cost(theta, X, y):
    m = X.shape[0]
    j = y.dot(np.log(h(theta, X))) + (1 - y).dot(np.log(1 - h(theta, X)))
    return (-j/m)
def grad(theta, X, y):
    m = X.shape[0]
    return ((h(theta, X) - y).dot(X))/m
# Add a column of ones:
m, features = np.shape(data)
features += 1
X = np.concatenate([np.ones((m, 1)), data], axis=1)
initial_theta = np.zeros((features))
def check_functions(grad_func, cost_func):
    '''
    Asserts that the cost and gradient functions return known correct values for a given theta, X, y.
    Test case from https://www.coursera.org/learn/machine-learning/discussions/weeks/3/threads/tA3ESpq0EeW70BJZtLVfGQ
    The expected cost is 4.6832.
    The expected gradient = [0.31722, 0.87232, 1.64812, 2.23787]
    '''
    test_X = np.array([[1, 8, 1, 6], [1, 3, 5, 7], [1, 4, 9, 2]])  # X
    test_y = np.array([[1, 0, 1]])  # y
    test_theta = np.array([-2, -1, 1, 2])
    grad_diff = grad_func(test_theta, test_X, test_y) - np.array([0.31722, 0.87232, 1.64812, 2.23787])
    assert grad_diff.dot(grad_diff.T) < 0.0001
    assert abs(cost_func(test_theta, test_X, test_y) - 4.6832) < 0.0001
check_functions(grad, cost)
# `cutoff` slices out a subset of rows.
cutoff = 2
print minimize(fun=cost, x0=initial_theta, args=(X[0:cutoff, :], y[0:cutoff]), jac=grad)
This code fails with:
fun: nan
hess_inv: array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])
jac: array([ 0., 0., 0.])
message: 'Desired error not necessarily achieved due to precision loss.'
nfev: 32
nit: 1
njev: 32
status: 2
success: False
x: array([ -0.5 , -16.2275926 , -30.47992258])
/Users/liavkoren/Envs/data-sci/lib/python2.7/site-packages/ipykernel/__main__.py:25: RuntimeWarning: overflow encountered in power
/Users/liavkoren/Envs/data-sci/lib/python2.7/site-packages/ipykernel/__main__.py:38: RuntimeWarning: divide by zero encountered in log
/Users/liavkoren/Envs/data-sci/lib/python2.7/site-packages/ipykernel/__main__.py:42: RuntimeWarning: divide by zero encountered in log
There was overflow occurring in the calls to np.power inside the sigmoid function. I added debugging messages into the cost function and saw the following:
theta: [ 0. 0. 0.]
--
X: [[ 1. 34.62365962 78.02469282]
[ 1. 30.28671077 43.89499752]]
--
y=1: [ 0.5 0.5] y=0: [ 0.5 0.5]
log probabilities:
y=1: [-0.69314718 -0.69314718]
y=0: [-0.69314718 -0.69314718]
=======
theta: [ -0.5 -16.2275926 -30.47992258]
--
X: [[ 1. 34.62365962 78.02469282]
[ 1. 30.28671077 43.89499752]]
--
y=1: [ 0. 0.] y=0: [ 1. 1.]
log probabilities:
y=1: [-inf -inf]
y=0: [ 0. 0.]
This overflows on the second iteration!!
I quickly confirmed that this does seem to be the problem by scaling the dataset down by a factor of 10, after which it converged. I guess I will have to look at feature scaling/normalization or some other strategies for avoiding overflow.
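For reference, here is a minimal sketch of that feature-scaling idea, reusing data, y, cost and grad from the snippet above (scipy.special.expit would also be a numerically stable replacement for the hand-rolled sigmoid):
# Standardize each feature column (zero mean, unit variance) before adding the intercept column.
X_scaled = (data - data.mean(axis=0)) / data.std(axis=0)
X_norm = np.concatenate([np.ones((X_scaled.shape[0], 1)), X_scaled], axis=1)
theta0 = np.zeros(X_norm.shape[1])
# With the features on this scale, x.dot(theta) stays small and the sigmoid no longer overflows.
print minimize(fun=cost, x0=theta0, args=(X_norm, y), jac=grad)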
I am attempting to find a solution to a reliability engineering nonlinear problem with only a single unknown. This should be relatively easy; however, the problem I have come across is within the summation part of the equation. My research so far leads me to believe that it is not possible to run a for loop within a solver in Python.
The equation I am trying to solve is:
((1 + beta_hat*T) / (beta_hat^2 * T)) * ln(1 + beta_hat*T) * sum_{i=1..m} [ n_i / (1/beta_hat + v_i + (1 - d_i)*(T - v_i)) ] = m
The equation considers the outcomes of a simple reliability test. Some of the equation variables are included within lists while others are single scalars. The equation variables are:
d = the "fix effectiveness factor", the fraction of a failure mode's failure rate that will be mitigated by corrective action
v = the time during the test that the corrective action was applied
n = the number of failures observed during the test for each failure mode
m = the total number of failure modes surfaced
T = the total test time (in this case 25 units tested for 175 hours = 4375 hours).
I am trying to solve the equation for beta_hat. I have used other methods/software to solve the equation and I know that beta_hat = 0.0016, however, I need to solve this equation using Python as that is what I used for all other code.
The lists holding the values for each equation element are:
d = [0.5, 0., 0., 0., 0.8, 0., 0.7, 0.8, 0.5, 0.5, 0.5, 0., 0.]
v = [4375., 0., 0., 0., 4375., 0., 4375., 2500., 4375., 4375., 4375., 0., 0.]
n = [1, 3, 16, 2, 1, 4, 1, 3, 1, 1, 1, 8, 1]
m = 13
T = 4375
I have unsuccessfully tried to use scipy.optimize (fsolve, root, least_squares), but I'm starting to run out of ideas and I think the problem may be that I can't run a for loop in a solver in Python like this example:
def f(x):
    for i in range(m):
        ((1 + x * T) / (x**2 * T)) * math.log(1 + x * T) * np.sum(n[i] / (1 / x + v[i] + (1 - d[i]) * (T - v[i]))) - m
    return f
result = optimize.root(f, 0.01)
print(result)
Any suggestions/ideas on how I might tackle this problem? Could there be a way I'm missing to run the for loop outside of the solver?
Given Tim's comments and feedback, I modified my code and checked the answer using both fsolve and root. Both efforts delivered the same result (as expected).
from math import log
from scipy.optimize import fsolve, root
d = [0.5, 0., 0., 0., 0.8, 0., 0.7, 0.8, 0.5, 0.5, 0.5, 0., 0.]
v = [4375., 0., 0., 0., 4375., 0., 4375., 2500., 4375., 4375., 4375., 0., 0.]
n = [1, 3, 16, 2, 1, 4, 1, 3, 1, 1, 1, 8, 1]
m = 13
T = 4375
def f(x):
    c = 0.0
    for i in range(m):
        c += n[i] / (1.0 / x + v[i] + (1.0 - d[i]) * (T - v[i]))
    return ((1.0 + x * T) / (x**2 * T)) * log(1.0 + x * T) * c - m
which results in
>>> fsolve(f, 0.001)
array([0.00163344])
>>> root(f, 0.001)
x: array([0.00163344])
As I mentioned in a comment, your f() doesn't compute a number for the solver to use. Nor does it appear to implement the equation you linked to. Here's a complete program, including a made-up toy solver:
def bisect(f, lo, hi, numiters):
    assert f(lo) < 0.0
    assert f(hi) > 0.0
    for _ in range(numiters):
        mid = lo + (hi - lo) / 2.0
        y = f(mid)
        if y < 0.0:
            lo = mid
        else:
            hi = mid
    return lo
d = [0.5, 0., 0., 0., 0.8, 0., 0.7, 0.8, 0.5, 0.5, 0.5, 0., 0.]
v = [4375., 0., 0., 0., 4375., 0., 4375., 2500., 4375., 4375., 4375., 0., 0.]
n = [1, 3, 16, 2, 1, 4, 1, 3, 1, 1, 1, 8, 1]
m = 13
T = 4375
def f(x):
    from math import log
    s = 0.0
    for i in range(m):
        s += n[i] / (1.0 / x + v[i] + (1.0 - d[i]) * (T - v[i]))
    return s * ((1.0 + x * T) / (x**2 * T)) * log(1.0 + x * T) - m
Given that:
>>> f(5)
-12.979641343531911
>>> f(0.001)
3.975686546173577
>>> bisect(f, 5, 0.001, 53)
0.0016334432920776716
which appears to match the 0.0016 you expect the answer to be.
So think about all that: the problem almost certainly isn't in the solvers you're trying to use, but in the function you're passing to them.
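For what it's worth, the same function can also be written without the explicit Python loop by switching to NumPy arrays; this is only a sketch of the vectorized form of the f above, not a different method:
import numpy as np
from scipy.optimize import root

d = np.array([0.5, 0., 0., 0., 0.8, 0., 0.7, 0.8, 0.5, 0.5, 0.5, 0., 0.])
v = np.array([4375., 0., 0., 0., 4375., 0., 4375., 2500., 4375., 4375., 4375., 0., 0.])
n = np.array([1., 3., 16., 2., 1., 4., 1., 3., 1., 1., 1., 8., 1.])
m = len(n)
T = 4375.0

def f(x):
    # element-wise over all failure modes, then summed
    s = np.sum(n / (1.0 / x + v + (1.0 - d) * (T - v)))
    return s * ((1.0 + x * T) / (x**2 * T)) * np.log(1.0 + x * T) - m

print(root(f, 0.001).x)  # should match the ~0.00163 found above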
I am trying to compare several datasets and basically test whether they show the same feature, although this feature might be shifted, reversed, or attenuated.
A very simple example below:
A = np.array([0., 0, 0, 1., 2., 3., 4., 3, 2, 1, 0, 0, 0])
B = np.array([0., 0, 0, 0, 0, 1, 2., 3., 4, 3, 2, 1, 0])
C = np.array([0., 0, 0, 1, 1.5, 2, 1.5, 1, 0, 0, 0, 0, 0])
D = np.array([0., 0, 0, 0, 0, -2, -4, -2, 0, 0, 0, 0, 0])
x = np.arange(0,len(A),1)
I thought the best way to do it would be to normalize these signals and get absolute values (their attenuation is not important for me at this stage, I am interested in the position... but I might be wrong, so I will welcome thoughts about this concept too) and calculate the area where they overlap. I am following up on this answer - the solution looked very elegant and simple, but I may be implementing it wrongly.
def normalize(sig):
    #ns = sig/max(np.abs(sig))
    ns = sig/sum(sig)
    return ns
a = normalize(A)
b = normalize(B)
c = normalize(C)
d = normalize(D)
which then look like this:
But then, when I try to implement the solution from the answer, I run into problems.
OLD
for c1,w1 in enumerate([a,b,c,d]):
    for c2,w2 in enumerate([a,b,c,d]):
        w1 = np.abs(w1)
        w2 = np.abs(w2)
        M[c1,c2] = integrate.trapz(min(np.abs(w2).any(),np.abs(w1).any()))
print M
Produces TypeError: 'numpy.bool_' object is not iterable or IndexError: list assignment index out of range. But I only included the .any() because without them, I was getting the ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().
EDIT - NEW
(thanks @Kody King)
The new code is now:
M = np.zeros([4,4])
SH = np.zeros([4,4])
for c1,w1 in enumerate([a,b,c,d]):
    for c2,w2 in enumerate([a,b,c,d]):
        crossCorrelation = np.correlate(w1,w2, 'full')
        bestShift = np.argmax(crossCorrelation)
        # This reverses the effect of the padding.
        actualShift = bestShift - len(w2) + 1
        similarity = crossCorrelation[bestShift]
        M[c1,c2] = similarity
        SH[c1,c2] = actualShift
M = M/M.max()
print M, '\n', SH
And the output:
[[ 1. 1. 0.95454545 0.63636364]
[ 1. 1. 0.95454545 0.63636364]
[ 0.95454545 0.95454545 0.95454545 0.63636364]
[ 0.63636364 0.63636364 0.63636364 0.54545455]]
[[ 0. -2. 1. 0.]
[ 2. 0. 3. 2.]
[-1. -3. 0. -1.]
[ 0. -2. 1. 0.]]
The matrix of shifts looks ok now, but the actual correlation matrix does not. I am really puzzled by the fact that the lowest correlation value is for correlating d with itself. What I would like to achieve now is that:
EDIT - UPDATE
Following on the advice, I used the recommended normalization formula (dividing the signal by its sum), but the problem wasn't solved, just reversed. Now the correlation of d with d is 1, but all the other signals don't correlate with themselves.
New output:
[[ 0.45833333 0.45833333 0.5 0.58333333]
[ 0.45833333 0.45833333 0.5 0.58333333]
[ 0.5 0.5 0.57142857 0.66666667]
[ 0.58333333 0.58333333 0.66666667 1. ]]
[[ 0. -2. 1. 0.]
[ 2. 0. 3. 2.]
[-1. -3. 0. -1.]
[ 0. -2. 1. 0.]]
1. The correlation value should be highest for correlating a signal with itself (i.e. the highest values should be on the main diagonal).
2. The correlation values should be in the range between 0 and 1, so that as a result I would have 1s on the main diagonal and other numbers (0.x) elsewhere.
I was hoping the M = M/M.max() would do the job, but only if condition no. 1 is fulfilled, which it currently isn't.
As ssm said, numpy's correlate function works well for this problem. You mentioned that you are interested in the position. The correlate function can also help you tell how far one sequence is shifted from another.
import numpy as np
def compare(a, b):
    # 'full' pads the sequences with 0's so they are correlated
    # with as little as 1 actual element overlapping.
    crossCorrelation = np.correlate(a,b, 'full')
    bestShift = np.argmax(crossCorrelation)
    # This reverses the effect of the padding.
    actualShift = bestShift - len(b) + 1
    similarity = crossCorrelation[bestShift]
    print('Shift: ' + str(actualShift))
    print('Similarity: ' + str(similarity))
    return {'shift': actualShift, 'similarity': similarity}
print('\nExpected shift: 0')
compare([0,0,1,0,0], [0,0,1,0,0])
print('\nExpected shift: 2')
compare([0,0,1,0,0], [1,0,0,0,0])
print('\nExpected shift: -2')
compare([1,0,0,0,0], [0,0,1,0,0])
Edit:
You need to normalize each sequence before correlating them, or the larger sequences will have a very high correlation with all the other sequences.
A property of cross-correlation is that, for non-negative sequences, every value of the correlation, sum_n a[n]*b[n+k], is at most (sum_n a[n]) * (sum_n b[n]).
So if you normalize by dividing each sequence by its sum, the similarity will always be between 0 and 1.
I recommend you don't take the absolute value of a sequence. That changes the shape, not just the scale. For instance, np.abs([1, -2]) == [1, 2]. Normalizing will already ensure that the sequence is mostly positive and adds up to 1.
Second Edit:
I had a realization. Think of the signals as vectors. Normalized vectors always have a maximal dot product with themselves. Cross-correlation is just a dot product calculated at various shifts. If you normalize the signals like you would a vector (divide s by sqrt(s dot s)), the self-correlations will always be maximal and equal to 1.
import numpy as np
def normalize(s):
    magSquared = np.correlate(s, s)  # s dot itself
    return s / np.sqrt(magSquared)
a = np.array([0., 0, 0, 1., 2., 3., 4., 3, 2, 1, 0, 0, 0])
b = np.array([0., 0, 0, 0, 0, 1, 2., 3., 4, 3, 2, 1, 0])
c = np.array([0., 0, 0, 1, 1.5, 2, 1.5, 1, 0, 0, 0, 0, 0])
d = np.array([0., 0, 0, 0, 0, -2, -4, -2, 0, 0, 0, 0, 0])
a = normalize(a)
b = normalize(b)
c = normalize(c)
d = normalize(d)
M = np.zeros([4,4])
SH = np.zeros([4,4])
for c1,w1 in enumerate([a,b,c,d]):
    for c2,w2 in enumerate([a,b,c,d]):
        # Taking the absolute value catches signals which are flipped.
        crossCorrelation = np.abs(np.correlate(w1, w2, 'full'))
        bestShift = np.argmax(crossCorrelation)
        # This reverses the effect of the padding.
        actualShift = bestShift - len(w2) + 1
        similarity = crossCorrelation[bestShift]
        M[c1,c2] = similarity
        SH[c1,c2] = actualShift
print(M, '\n', SH)
Outputs:
[[ 1. 1. 0.97700842 0.86164044]
[ 1. 1. 0.97700842 0.86164044]
[ 0.97700842 0.97700842 1. 0.8819171 ]
[ 0.86164044 0.86164044 0.8819171 1. ]]
[[ 0. -2. 1. 0.]
[ 2. 0. 3. 2.]
[-1. -3. 0. -1.]
[ 0. -2. 1. 0.]]
You want to use a cross-correlation between the vectors:
https://en.wikipedia.org/wiki/Cross-correlation
https://docs.scipy.org/doc/numpy/reference/generated/numpy.correlate.html
For example:
>>> np.correlate(A,B)
array([ 31.])
>>> np.correlate(A,C)
array([ 19.])
>>> np.correlate(A,D)
array([-28.])
If you don't care about the sign, you can simply take the absolute value ...
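For instance, reusing the A/D correlation value already shown above:
>>> np.abs(np.correlate(A, D))
array([ 28.])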
I'm having trouble understanding a basic concept with tensorflow. How does indexing work for tensor read/write operations? In order to make this specific, how can the following numpy examples be translated to tensorflow (using tensors for the arrays, indices and values being assigned):
x = np.zeros((3, 4))
row_indices = np.array([1, 1, 2])
col_indices = np.array([0, 2, 3])
x[row_indices, col_indices] = 2
x
with output:
array([[ 0., 0., 0., 0.],
[ 2., 0., 2., 0.],
[ 0., 0., 0., 2.]])
... and ...
x[row_indices, col_indices] = np.array([5, 4, 3])
x
with output:
array([[ 0., 0., 0., 0.],
[ 5., 0., 4., 0.],
[ 0., 0., 0., 3.]])
... and finally ...
y = x[row_indices, col_indices]
y
with output:
array([ 5., 4., 3.])
There's GitHub issue #206 to support this nicely; meanwhile you have to resort to verbose work-arounds.
The first example can be done with tf.select, which combines two same-shaped tensors by selecting each element from one or the other:
tf.reset_default_graph()
row_indices = tf.constant([1, 1, 2])
col_indices = tf.constant([0, 2, 3])
x = tf.zeros((3, 4))
sess = tf.InteractiveSession()
# get list of ((row1, col1), (row2, col2), ..)
coords = tf.transpose(tf.pack([row_indices, col_indices]))
# get tensor with 1's at positions (row1, col1),...
binary_mask = tf.sparse_to_dense(coords, x.get_shape(), 1)
# convert 1/0 to True/False
binary_mask = tf.cast(binary_mask, tf.bool)
twos = 2*tf.ones(x.get_shape())
# make new x out of old values or 2, depending on mask
x = tf.select(binary_mask, twos, x)
print x.eval()
gives
[[ 0. 0. 0. 0.]
[ 2. 0. 2. 0.]
[ 0. 0. 0. 2.]]
The second one could be done with scatter_update, except scatter_update only supports linear indices and works on variables. So you could create a temporary variable and use reshaping like this (to avoid variables you could use dynamic_stitch, see the end):
# get linear indices
linear_indices = row_indices*x.get_shape()[1]+col_indices
# turn 'x' into 1d variable since "scatter_update" supports linear indexing only
x_flat = tf.Variable(tf.reshape(x, [-1]))
# no automatic promotion, so make updates float32 to match x
updates = tf.constant([5, 4, 3], dtype=tf.float32)
sess.run(tf.initialize_all_variables())
sess.run(tf.scatter_update(x_flat, linear_indices, updates))
# convert back into original shape
x = tf.reshape(x_flat, x.get_shape())
print x.eval()
gives
[[ 0. 0. 0. 0.]
[ 5. 0. 4. 0.]
[ 0. 0. 0. 3.]]
Finally, the third example is already supported with gather_nd; you write
print tf.gather_nd(x, coords).eval()
To get
[ 5. 4. 3.]
Edit, May 6
The update x[rows, cols] = newvals can be done without using Variables (which occupy memory between session.run calls) by using select with a sparse_to_dense that takes a vector of sparse values, or by relying on dynamic_stitch:
sess = tf.InteractiveSession()
x = tf.zeros((3, 4))
row_indices = tf.constant([1, 1, 2])
col_indices = tf.constant([0, 2, 3])
# no automatic promotion, so specify float type
replacement_vals = tf.constant([5, 4, 3], dtype=tf.float32)
# convert to linear indexing in row-major form
linear_indices = row_indices*x.get_shape()[1]+col_indices
x_flat = tf.reshape(x, [-1])
# use dynamic stitch, it merges the array by taking value either
# from array1[index1] or array2[index2], if indices conflict,
# the later one is used
unchanged_indices = tf.range(tf.size(x_flat))
changed_indices = linear_indices
x_flat = tf.dynamic_stitch([unchanged_indices, changed_indices],
[x_flat, replacement_vals])
x = tf.reshape(x_flat, x.get_shape())
print x.eval()
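For completeness, and only if you are on a much newer TensorFlow than the 0.x-era API this answer targets (an assumption: TF 2.x with eager execution), the three numpy examples map directly onto tf.tensor_scatter_nd_update and tf.gather_nd, with no variables or masks; a minimal sketch:
import tensorflow as tf  # assumes TensorFlow 2.x, eager mode

x = tf.zeros((3, 4))
coords = tf.constant([[1, 0], [1, 2], [2, 3]])  # (row, col) pairs
# x[rows, cols] = 2
x = tf.tensor_scatter_nd_update(x, coords, tf.constant([2., 2., 2.]))
# x[rows, cols] = [5, 4, 3]
x = tf.tensor_scatter_nd_update(x, coords, tf.constant([5., 4., 3.]))
# y = x[rows, cols]
y = tf.gather_nd(x, coords)
print(x.numpy())
print(y.numpy())  # [5. 4. 3.]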
Looking for examples of how to use image processing tools to "describe" images and shapes of any sort, I have stumbled upon the Scikit-image skimage.measure.moments_central(image, cr, cc, order=3) function.
They give an example of how to use this function:
from skimage import measure #Package name in Enthought Canopy
import numpy as np
image = np.zeros((20, 20), dtype=np.double) #Square image of zeros
image[13:17, 13:17] = 1 #Adding a square of 1s
m = measure.moments(image)
cr = m[0, 1] / m[0, 0] #Row of the centroid (x coordinate)
cc = m[1, 0] / m[0, 0] #Column of the centroid (y coordinate)
In [1]: measure.moments_central(image, cr, cc)
Out[1]:
array([[ 16., 0., 20., 0.],
[ 0., 0., 0., 0.],
[ 20., 0., 25., 0.],
[ 0., 0., 0., 0.]])
1) What does each of the values represent? Since the (0, 0) element is 16, I gather this number corresponds to the area of the square of 1s, and therefore it is mu zero-zero. But what about the others?
2) Is this always a symmetric matrix?
3) What are the values associated with the famous second central moments?
The array returned by measure.moments_central corresponds to the formulas at https://en.wikipedia.org/wiki/Image_moment (section on central moments). mu_00 indeed corresponds to the area of the object.
The inertia matrix is not always symmetric, as shown by this example where the object is a rectangle instead of a square.
>>> image = np.zeros((20, 20), dtype=np.double) #Square image of zeros
>>> image[14:16, 13:17] = 1
>>> m = measure.moments(image)
>>> cr = m[0, 1] / m[0, 0]
>>> cc = m[1, 0] / m[0, 0]
>>> measure.moments_central(image, cr, cc)
array([[ 8. , 0. , 2. , 0. ],
[ 0. , 0. , 0. , 0. ],
[ 10. , 0. , 2.5, 0. ],
[ 0. , 0. , 0. , 0. ]])
As for second-order moments, they are mu_02, mu_11, and mu_20 (the coefficients with i + j = 2). The same Wikipedia page https://en.wikipedia.org/wiki/Image_moment explains how to use second-order moments for computing the orientation of objects.
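As a rough illustration of that last point (my sketch, not from the original answer; double-check which array index your skimage version treats as the row axis), the orientation follows from the normalized second-order central moments:
import numpy as np

mu = measure.moments_central(image, cr, cc)   # array from the example above
mu20p = mu[2, 0] / mu[0, 0]   # normalized second-order central moments
mu02p = mu[0, 2] / mu[0, 0]
mu11p = mu[1, 1] / mu[0, 0]
# orientation of the object's major axis, in radians (formula from the Wikipedia page)
theta = 0.5 * np.arctan2(2.0 * mu11p, mu20p - mu02p)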
I'm working on a project that involves a lot of matrix computation.
I'm looking for a smart way to speed up my code. In my project, I'm dealing with a sparse matrix of size 100M x 1M with around 10M non-zero values. The example below is just to illustrate my point.
Let's say I have:
A vector v of size (2)
A vector c of size (3)
A sparse matrix X of size (2,3)
import numpy as np
import scipy.sparse
from scipy.sparse import coo_matrix

v = np.asarray([10, 20])
c = np.asarray([ 2, 3, 4])
data = np.array([1, 1, 1, 1])
row = np.array([0, 0, 1, 1])
col = np.array([1, 2, 0, 2])
X = coo_matrix((data,(row,col)), shape=(2,3))
X.todense()
# matrix([[0, 1, 1],
# [1, 0, 1]])
Currently I'm doing:
result = np.zeros_like(v)
d = scipy.sparse.lil_matrix((v.shape[0], v.shape[0]))
d.setdiag(v)
tmp = d * X
print tmp.todense()
#matrix([[ 0., 10., 10.],
# [ 20., 0., 20.]])
# At this point tmp is csr sparse matrix
for i in range(tmp.shape[0]):
    x_i = tmp.getrow(i)
    result += x_i.data * ( c[x_i.indices] - x_i.data)
    # I only want to do the subtraction on non-zero elements
print result
# array([-430, -380])
And my problem is the for loop and especially the subtraction.
I would like to find a way to vectorize this operation by subtracting only on the non-zero elements.
Something to get directly the sparse matrix on the subtraction:
matrix([[ 0., -7., -6.],
[ -18., 0., -16.]])
Is there a way to do this smartly?
You don't need to loop over the rows to do what you are already doing. And you can use a similar trick to perform the multiplication of the rows by the first vector:
import numpy as np
import scipy.sparse as sps
# X must be in CSR format for the .indptr / .indices used below (X = X.tocsr() if needed)
# number of nonzero entries per row of X
nnz_per_row = np.diff(X.indptr)
# multiply every row by the corresponding entry of v
# You could do this in-place as:
# X.data *= np.repeat(v, nnz_per_row)
Y = sps.csr_matrix((X.data * np.repeat(v, nnz_per_row), X.indices, X.indptr),
shape=X.shape)
# subtract from the non-zero entries the corresponding column value in c...
Y.data -= np.take(c, Y.indices)
# ...and multiply by -1 to get the value you are after
Y.data *= -1
To see that it works, set up some dummy data
rows, cols = 3, 5
v = np.random.rand(rows)
c = np.random.rand(cols)
X = sps.rand(rows, cols, density=0.5, format='csr')
and after running the code above:
>>> x = X.toarray()
>>> mask = x == 0
>>> x *= v[:, np.newaxis]
>>> x = c - x
>>> x[mask] = 0
>>> x
array([[ 0.79935123, 0. , 0. , -0.0097763 , 0.59901243],
[ 0.7522559 , 0. , 0.67510109, 0. , 0.36240006],
[ 0. , 0. , 0.72370725, 0. , 0. ]])
>>> Y.toarray()
array([[ 0.79935123, 0. , 0. , -0.0097763 , 0.59901243],
[ 0.7522559 , 0. , 0.67510109, 0. , 0.36240006],
[ 0. , 0. , 0.72370725, 0. , 0. ]])
The way you are accumulating your result requires that there are the same number of non-zero entries in every row, which seems a pretty weird thing to do. Are you sure that is what you are after? If that's really what you want you could get that value with something like:
result = np.sum(Y.data.reshape(Y.shape[0], -1), axis=0)
but I have trouble believing that is really what you are after...
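If that accumulation really is what you want, here is a sketch of a loop-free version of it, under the same equal-non-zeros-per-row assumption and reusing tmp and c from the question; on the 2x3 example above it reproduces array([-430, -380]):
# tmp is the CSR product diag(v) * X from the question
diffs = tmp.data * (np.take(c, tmp.indices) - tmp.data)
result = diffs.reshape(tmp.shape[0], -1).sum(axis=0)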