I have two sparse matrices mat1 and mat2 (most entries are zero), and I'm not interested in the zero-valued entries: I look at the matrices from a graph-theoretical perspective, where a zero means there is no edge between the nodes.
How can I efficiently get the minimum values between non-zero entries only using scipy.sparse matrices?
I.e. an equivalent of mat1.minimum(mat2) that would ignore implicit zeros.
Using dense matrices, it is fairly easy to do:
import numpy as np
nnz = np.where(np.multiply(mat1, mat2))    # indices where both matrices are nonzero
m = mat1 + mat2                            # keeps entries present in only one matrix
m[nnz] = np.minimum(mat1[nnz], mat2[nnz])  # take the minimum where both are present
But this would be very inefficient with sparse matrices.
NB: a similar question has been asked before but did not get any relevant answer, and there is a related PR on the scipy repo that proposes an implementation of this for (arg)min/max, but not for minimum.
EDIT: to be a bit more specific, the desired behavior would be commutative, i.e. this nonzero-minimum would take all values present in only one of the two matrices, and the minimum of the entries present in both matrices.
Just in case someone also looks for this, my current implementation is below.
However, I'd appreciate any proposal that would either speed this up or reduce the memory footprint.
s = mat1.multiply(mat2)
s.data[:] = 1.            # indicator of entries present in both matrices
a1 = mat1.copy()
a1.data[:] = 1.
a1 = (a1 - s).maximum(0)  # indicator of entries present only in mat1
a2 = mat2.copy()
a2.data[:] = 1.
a2 = (a2 - s).maximum(0)  # indicator of entries present only in mat2
res = mat1.multiply(a1) + mat2.multiply(a2) + \
      mat1.multiply(s).minimum(mat2.multiply(s))
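For completeness, here is a quick sanity check of this approach on toy matrices (the matrices below are my own illustration, not from the actual data):
import numpy as np
import scipy.sparse as sp

mat1 = sp.csr_matrix([[5., 0., 3.],
                      [0., 2., 0.]])
mat2 = sp.csr_matrix([[4., 7., 0.],
                      [0., 1., 0.]])

s = mat1.multiply(mat2)
s.data[:] = 1.
a1 = mat1.copy()
a1.data[:] = 1.
a1 = (a1 - s).maximum(0)
a2 = mat2.copy()
a2.data[:] = 1.
a2 = (a2 - s).maximum(0)
res = mat1.multiply(a1) + mat2.multiply(a2) + \
      mat1.multiply(s).minimum(mat2.multiply(s))

print(res.toarray())
# expected nonzero-minimum: [[4, 7, 3], [0, 1, 0]]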
If the sparse nonzeros are positive, an alternative that gets the desired UNION behavior out of maximum is to reverse the values while keeping them positive.
Following your lead of manipulating .data explicitly, I found:
def sp_min_nz_positive(asp, bsp):  # a and b scipy sparse
    amax = asp.max()
    bmax = bsp.max()
    abmaxplus = max(amax, bmax)  # + 1.0 : surprise! not needed.
    # invert the direction, while remaining positive
    arev = asp.copy()
    arev.data[:] = abmaxplus - asp.data[:]
    brev = bsp.copy()
    brev.data[:] = abmaxplus - bsp.data[:]
    out = arev.maximum(brev)
    # revert the direction of these positives
    out.data[:] = abmaxplus - out.data[:]
    return out
There may be some inexactness due to floating-point roundoff.
There was also a suggestion to use sparse internals. A rather generic function is sp.find, which returns the nonzero elements of anything. So you could also try a minimum that handles negative values too, with something like:
import scipy.sparse as sp

def sp_min_union(a, b):
    assert a.shape == b.shape
    assert sp.issparse(a) and sp.issparse(b)
    (ra, ca, _) = sp.find(a)  # over nonzeros only
    (rb, cb, _) = sp.find(b)  # over nonzeros only
    setab = set(zip(ra, ca)).union(zip(rb, cb))  # row-column union of nonzeros
    r = []
    c = []
    v = []
    for (rr, cc) in setab:
        r.append(rr)
        c.append(cc)
        anz = a[rr, cc]
        bnz = b[rr, cc]
        assert anz != 0 or bnz != 0  # they came from *some* sp.find
        if anz == 0:
            anz = bnz
        elif bnz != 0:               # both nonzero: take the smaller
            anz = min(anz, bnz)
        v.append(anz)
    # choose whatever sparse output format you want; many are constructible as:
    return sp.csr_matrix((v, (r, c)), shape=a.shape)
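And a quick check of sp_min_union on toy inputs that include a negative entry (again, matrices of my own choosing, assuming the sp_min_union definition above):
a = sp.csr_matrix([[5., 0., -3.],
                   [0., 2.,  0.]])
b = sp.csr_matrix([[4., 7.,  0.],
                   [0., 1.,  0.]])
print(sp_min_union(a, b).toarray())
# expected union-of-nonzeros minimum:
# [[ 4.  7. -3.]
#  [ 0.  1.  0.]]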
I'm trying to optimize the function 'pw' in the following code using only NumPy functions (or perhaps list comprehensions).
from time import time
import numpy as np
def pw(x, udata):
    """
    Creates the step function

                  | 1, if d0 <= x < d1
                  | 2, if d1 <= x < d2
    pw(x, data) = | ...
                  | N, if d(N-1) <= x < dN
                  | 0, otherwise

    where di is the ith element in data.
    INPUT:  x -- interval which the step function is defined over
            data -- an ordered set of data (without repetitions)
    OUTPUT: pw_func -- an array of size x.shape[0]
    """
    vals = np.arange(1, udata.shape[0]+1).reshape(udata.shape[0], 1)
    pw_func = np.sum(np.where(np.greater_equal(x, udata)*np.less(x, np.roll(udata, -1)), vals, 0), axis=0)
    return pw_func
N = 50000
x = np.linspace(0,10,N)
data = [1,3,4,5,5,7]
udata = np.unique(data)
ti = time()
pw(x,udata)
tf = time()
print(tf - ti)
import cProfile
cProfile.run('pw(x,udata)')
The cProfile.run output tells me that most of the overhead is coming from np.where (about 1 ms), but I'd like to create faster code if possible. It seems that performing the operations row-wise versus column-wise makes some difference, unless I'm mistaken, but I think I've accounted for it. I know that list comprehensions can sometimes be faster, but I couldn't figure out a faster way than what I'm doing with them.
np.searchsorted seems to yield better performance, but that 1 ms still remains on my computer:
(modified)
def pw(xx, uu):
    """
    Creates the step function

                  | 1, if d0 <= x < d1
                  | 2, if d1 <= x < d2
    pw(x, data) = | ...
                  | N, if d(N-1) <= x < dN
                  | 0, otherwise

    where di is the ith element in data.
    INPUT:  x -- interval which the step function is defined over
            data -- an ordered set of data (without repetitions)
    OUTPUT: pw_func -- an array of size x.shape[0]
    """
    inds = np.searchsorted(uu, xx, side='right')
    vals = np.arange(1, uu.shape[0]+1)
    pw_func = vals[inds[inds != uu.shape[0]]]
    num_mins = np.sum(xx < np.min(uu))
    num_maxs = np.sum(xx > np.max(uu))
    pw_func = np.concatenate((np.zeros(num_mins), pw_func, np.zeros(xx.shape[0]-pw_func.shape[0]-num_mins)))
    return pw_func
This answer using piecewise seems pretty close, but that's on a scalar x0 and x1. How would I do it on arrays? And would it be more efficient?
Understandably, x may be pretty big but I'm trying to put it through a stress test.
I am still learning, though, so any hints or tricks that could help me out would be great.
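As for the np.piecewise idea: np.piecewise does accept an array-valued x directly. A minimal sketch (the interval values below are my own illustration, not the linked answer's code):
import numpy as np

udata = np.array([1., 3., 4., 5., 7.])
x = np.linspace(0, 10, 11)

# one boolean condition per interval [d_i, d_{i+1})
conds = [(x >= lo) & (x < hi) for lo, hi in zip(udata[:-1], udata[1:])]
vals = list(range(1, len(udata)))        # constant value for each interval
pw_func = np.piecewise(x, conds, vals)   # points matching no condition stay 0
print(pw_func)
Whether this beats searchsorted is doubtful, though, since it builds one boolean mask per interval.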
EDIT
There seems to be a mistake in the second function, since its resulting array doesn't match the one from the first function (which I'm confident works):
N1 = pw1(x,udata.reshape(udata.shape[0],1)).shape[0]
N2 = np.sum(pw1(x,udata.reshape(udata.shape[0],1)) == pw2(x,udata))
print(N1 - N2)
yields
15000
data points that are not the same. So it seems that I don't know how to use 'searchsorted'.
EDIT 2
Actually I fixed it:
pw_func = vals[inds[inds != uu.shape[0]]]
was changed to
pw_func = vals[inds[inds[(inds != uu.shape[0])*(inds != 0)]-1]]
so at least the resulting arrays match. But the question still remains on whether there's a more efficient way of going about doing this.
EDIT 3
Thanks Tin Lai for pointing out the mistake. This one should work
pw_func = vals[inds[(inds != uu.shape[0])*(inds != 0)]-1]
Maybe a more readable way of presenting it would be
non_endpts = (inds != uu.shape[0])*(inds != 0) # only consider the points in between the min/max data values
shift_inds = inds[non_endpts]-1 # searchsorted side='right' includes the left end point and not right end point so a shift is needed
pw_func = vals[shift_inds]
I think I got lost in all those brackets! I guess that's the importance of readability.
A very abstract yet interesting problem! Thanks for entertaining me, I had fun :)
P.S. I'm not sure about your pw2; I wasn't able to get it to output the same as pw1.
For reference the original pws:
def pw1(x, udata):
    vals = np.arange(1, udata.shape[0]+1).reshape(udata.shape[0], 1)
    pw_func = np.sum(np.where(np.greater_equal(x, udata)*np.less(x, np.roll(udata, -1)), vals, 0), axis=0)
    return pw_func

def pw2(xx, uu):
    inds = np.searchsorted(uu, xx, side='right')
    vals = np.arange(1, uu.shape[0]+1)
    pw_func = vals[inds[inds[(inds != uu.shape[0])*(inds != 0)]-1]]
    num_mins = np.sum(xx < np.min(uu))
    num_maxs = np.sum(xx > np.max(uu))
    pw_func = np.concatenate((np.zeros(num_mins), pw_func, np.zeros(xx.shape[0]-pw_func.shape[0]-num_mins)))
    return pw_func
My first attempt was utilising a lot of broadcasting operations from numpy:
def pw3(x, udata):
    # the None slice is to create a new axis
    step_bool = x >= udata[None, :].T
    # we exploit the fact that bools have an integer value of 1,
    # skipping the last value in "data"
    step_vals = np.sum(step_bool[:-1], axis=0)
    # for the step_bool row that we skipped above (last index),
    # we set step_vals to zero so that it is zeroed out once we reach
    # the last value in "data"
    step_vals[step_bool[-1]] = 0
    return step_vals
After looking at the searchsorted from your pw2, I had a new approach that utilises it with much higher performance:
def pw4(x, udata):
    inds = np.searchsorted(udata, x, side='right')
    # fix up the last entries if x already goes beyond data[-1]
    if x[-1] > udata[-1]:
        inds[inds == inds[-1]] = 0
    return inds
Plots with:
import matplotlib.pyplot as plt

plt.plot(pw1(x, udata.reshape(udata.shape[0], 1)), label='pw1')
plt.plot(pw2(x, udata), label='pw2')
plt.plot(pw3(x, udata), label='pw3')
plt.plot(pw4(x, udata), label='pw4')
With data = [1,3,4,5,5,7] and with data = [1,3,4,5,5,7,11] (plots not reproduced here), pw1, pw3 and pw4 are all identical:
print(np.all(pw1(x,udata.reshape(udata.shape[0],1)) == pw3(x,udata)))
>>> True
print(np.all(pw1(x,udata.reshape(udata.shape[0],1)) == pw4(x,udata)))
>>> True
Performance (timeit.Timer(...).repeat() gives 3 measurements here; each value is the total time in seconds for number=1000 executions):
print(timeit.Timer('pw1(x,udata.reshape(udata.shape[0],1))', "from __main__ import pw1, x, udata").repeat(number=1000))
>>> [3.1938983199979702, 1.6096494779994828, 1.962694135003403]
print(timeit.Timer('pw2(x,udata)', "from __main__ import pw2, x, udata").repeat(number=1000))
>>> [0.6884554479984217, 0.6075002400029916, 0.7799002879983163]
print(timeit.Timer('pw3(x,udata)', "from __main__ import pw3, x, udata").repeat(number=1000))
>>> [0.7369808239964186, 0.7557657590004965, 0.8088172269999632]
print(timeit.Timer('pw4(x,udata)', "from __main__ import pw4, x, udata").repeat(number=1000))
>>> [0.20514375300263055, 0.20203858999957447, 0.19906871100101853]
I have a weird problem with iterators which I can't figure out. I have a complicated numerical routine returning a generator object (or, after some changes to the code, an islice). Afterwards I check the results, as I know that they must have a negative imaginary part:
import numpy as np
threshold = 1e-8 # just check up to some numerical accuracy
results = result_generator(**inputs)
is_valid = [np.all(_result.imag < threshold) for _result in results]
print("Number of valid results: ", is_valid.count(True))
(Sorry for not giving an executable code, but I can't come up with a simple code at the moment.)
The problem now is that this returns one valid solution. If I change the code to
import numpy as np
threshold = 1e-8 # just check up to some numerical accuracy
results = list(result_generator(**inputs))
is_valid = [np.all(_result.imag < threshold) for _result in results]
print("Number of valid results: ", is_valid.count(True))
using a list instead of a generator, I get zero valid solutions. I cannot, however, wrap my head around what is different and thus have no idea how to debug the problem.
If I step through with the debugger and print the result at the corresponding index, the results even differ: the one from the generator is correct, the one from the list is wrong.
Here is the numerical function:
from itertools import islice

import numpy as np

def result_generator(z, iw, coeff, n_min, n_max):
    assert n_min >= 1
    assert n_min < n_max
    if n_min % 2:
        # index must be even
        n_min += 1
    id1 = np.ones_like(z, dtype=complex)
    A0, A1 = 0.*id1, coeff[0]*id1
    A2 = coeff[0] * id1
    B2 = 1. * id1
    multiplier = np.subtract.outer(z, iw[:-1])*coeff[1:]
    multiplier = np.moveaxis(multiplier, -1, 0).copy()

    def _iteration(multiplier_im):
        multiplier_im = multiplier_im/B2
        A2[:] = A1 + multiplier_im*A0
        B2[:] = 1. + multiplier_im
        A0[:] = A1
        A1[:] = A2 / B2
        return A1

    complete_iterations = (_iteration(multiplier_im) for multiplier_im in multiplier)
    return islice(complete_iterations, n_min, n_max, 2)
You're yielding the same array over and over instead of making new arrays. When you call list, you get a list of references to the same array, and that array is in its final state. When you don't call list, you examine the array in the state the generator yields it, each time it's yielded.
Stop reusing the same array over and over.
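A minimal self-contained illustration of that pitfall and the fix (toy arrays of my own, not the original routine):
import numpy as np

def bad_gen():
    buf = np.zeros(3)
    for k in range(3):
        buf[:] = k        # every iteration mutates the same array in place
        yield buf         # every yielded reference points at that one array

def good_gen():
    buf = np.zeros(3)
    for k in range(3):
        buf[:] = k
        yield buf.copy()  # hand out an independent array each time

print([a.tolist() for a in list(bad_gen())])   # three views of the same final state
print([a.tolist() for a in list(good_gen())])  # three distinct snapshots
In result_generator the analogous change would be returning A1.copy() (or A2 / B2 as a fresh array) from _iteration.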
I'm new to Python and looking to find an optimized solution with a number of constraints, where those constraints are based on functions of the outputs.
First two constraints are straightforward:
1) output1 +output2 + output3 = 1
2) output1, output2, and output3 must all be >= 0
Last constraint needs functions of the outputs to be EQUAL:
3) f(output1) == f(output2) == f(output3)
In this case the function array is produced by a matrix multiplication of the outputs:
F = cov.dot(array([output1,output2,output3]))*array([output1,output2,output3])
f(output1) = F[0], f(output2) = F[1], f(output3) = F[2]
Hopefully I've described the problem clearly... Eventually I want to extend this to more outputs than 3.
What I have below gives me output values that don't appear to follow the constraints at all (gives me a negative value). I assume I'm entering the constraints wrong... or perhaps there is an easier way to do this with np.linalg.solve?
import numpy as np
from scipy.optimize import fsolve

cov = np.array([0.04, 0.0015, 0.03,
                0.0015, 0.0025, 0.000625,
                0.03, 0.000625, 0.0625]).reshape(3, 3)

weights = np.array([0.3, 0.2, 0.5])

def RC(w):
    return cov.dot(w)*w

riskcont = RC(weights)

def PV(riskcont):
    return np.sqrt(riskcont.sum())

portvol = PV(riskcont)

def ERC(z):
    w1 = z[0]
    w2 = z[1]
    w3 = z[2]
    # 1) weights sum to 100%
    out = w1 + w2 + w3 - 1
    # 2) weights above zero
    out.append((w1*w2*w3) > 0)
    # 3) riskcont must all be equal
    out.append([riskcont[0] == riskcont[1] == riskcont[2]])  # == riskcont(w4)
    return out

z = fsolve(ERC, [1/3, 1/3, 1/3])
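One way to hand fsolve a square system for this problem is to return one residual per unknown: the budget constraint plus pairwise equality of the risk contributions. This is only a sketch of that reformulation; fsolve cannot enforce the nonnegativity constraint, so a bounded solver such as scipy.optimize.minimize would be needed if negative weights appear:
import numpy as np
from scipy.optimize import fsolve

cov = np.array([0.04, 0.0015, 0.03,
                0.0015, 0.0025, 0.000625,
                0.03, 0.000625, 0.0625]).reshape(3, 3)

def residuals(w):
    rc = cov.dot(w) * w              # risk contributions f(output_i)
    return [w.sum() - 1.0,           # 1) weights sum to 100%
            rc[0] - rc[1],           # 3) contributions equal pairwise
            rc[1] - rc[2]]

w = fsolve(residuals, np.array([1.0/3, 1.0/3, 1.0/3]))
print(w)               # candidate weights
print(cov.dot(w) * w)  # their risk contributions (should be roughly equal)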
Looking for occurrences of a pattern on each row of a matrix, I found that there was no clear solution for doing this in Python with good performance for a very big matrix.
I have a matrix similar to
matrix = np.array([[0,1,1,0,1,0],
[0,1,1,0,1,0]])
print 'matrix: ', matrix
where I want to check the occurrences of the patterns [0,0], [0,1], [1,0] and [1,1] on each row, considering overlapping. For the example given, where both rows are equal, the result is equal for each pattern:
pattern[0,0] = [0,0]
pattern[0,1] = [2,2]
pattern[1,0] = [2,2]
pattern[1,1] = [1,1]
The matrix in this example is quite small, but I am looking for performance, as I have a huge matrix. You can test with matrix = numpy.random.randint(2, size=(100000,10)) or bigger, for example, to see the differences.
First I thought of a possible answer converting the rows to strings and looking for occurrences, based on this answer (string count with overlapping occurrences):
def string_occurrences(matrix):
    print '\n===== String count with overlapping ====='
    numRow, numCol = np.shape(matrix)
    Ocur = np.zeros((numRow, 4))
    for i in range(numRow):
        strList = ''.join(map(str, matrix[i, :]))
        Ocur[i, 0] = occurrences(strList, '00')
        Ocur[i, 1] = occurrences(strList, '01')
        Ocur[i, 2] = occurrences(strList, '10')
        Ocur[i, 3] = occurrences(strList, '11')
    return Ocur
using the occurrences function from that answer:
def occurrences(string, sub):
    count = start = 0
    while True:
        start = string.find(sub, start) + 1
        if start > 0:
            count += 1
        else:
            return count
but considering that the real array is huge, this solution is very slow, as it uses for loops, strings, etc.
So, looking for a numpy solution, I used a trick: compare the values with a pattern and roll the matrix on axis=1 to check all the occurrences.
I call it a pseudo rolling window on 2D, as the window is not square and the way of calculation is different. There are 2 options, where the second (Option 2) is faster because it avoids the extra calculation of numpy.roll:
def pseudo_rolling_window_Opt12(matrix):
    print '\n===== pseudo_rolling_window ====='
    numRow, numCol = np.shape(matrix)
    Ocur = np.zeros((numRow, 4))
    index = 0
    for i in np.arange(2):
        for j in np.arange(2):
            #pattern = -9*np.ones(numCol)    # Option 1
            pattern = -9*np.ones(numCol+1)   # Option 2
            pattern[0] = i
            pattern[1] = j
            for idCol in range(numCol-1):
                #Ocur[:,index] += np.sum(np.roll(matrix, -idCol, axis=1) == pattern, axis=1) == 2           # Option 1: 219.398691893 seconds (for my real matrix)
                Ocur[:,index] += np.sum(matrix[:, idCol:] == pattern[:-(idCol+1)], axis=1) == 2             # Option 2: 80.929688930 seconds (for my real matrix)
            index += 1
    return Ocur
Searching for other possibilities, I found the "rolling window", which seemed to be a good answer for performance as it uses numpy functions. Looking at this answer (Rolling window for 1D arrays in Numpy?) and the links in it, I checked the following function. But really, I do not understand the output, as the window calculations do not seem to match what I was expecting as a result.
def rolling_window(a, size):
    shape = a.shape[:-1] + (a.shape[-1] - size + 1, size)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
Used as:
a = rolling_window(matrix, 2)
print a == np.array([0,1])
print np.all(rolling_window(matrix, 2) == [0,1], axis=1)
Does someone know what is wrong in this last case? Or is there any other possibility with better performance?
You are using the wrong axis of the numpy array. You should change the axis in np.all from 1 to 2.
Using the following code:
a = rolling_window(matrix, 2)
print np.all(rolling_window(matrix, 2) == [0,1], axis=2)
you get:
>>>[[ True False False True False]
[ True False False True False]]
So, in order to get the results you are looking for:
print np.sum(np.all(rolling_window(matrix, 2) == [0,1], axis=2),axis=1)
>>>[2 2]
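As an aside, if all four patterns are needed at once, a variation (my own sketch, not part of the answer above) is to encode each overlapping pair as an integer 0-3 and count per row:
import numpy as np

def pattern_counts(matrix):
    # encode each overlapping pair (a, b) as 2*a + b, i.e. 0, 1, 2 or 3
    codes = 2 * matrix[:, :-1] + matrix[:, 1:]
    # one column of counts per pattern [0,0], [0,1], [1,0], [1,1]
    return np.stack([(codes == k).sum(axis=1) for k in range(4)], axis=1)

matrix = np.array([[0, 1, 1, 0, 1, 0],
                   [0, 1, 1, 0, 1, 0]])
print(pattern_counts(matrix))
# [[0 2 2 1]
#  [0 2 2 1]]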
I'm rather new to NumPy. Anyone have an idea for making this code, especially the nested loops, more compact/efficient? BTW, dist and data are three-dimensional numpy arrays.
def interpolate_to_distance(self, distance):
    interpolated_data = np.ndarray(self.dist.shape[1:])
    for j in range(interpolated_data.shape[1]):
        for i in range(interpolated_data.shape[0]):
            interpolated_data[i, j] = np.interp(
                distance, self.dist[:, i, j], self.data[:, i, j])
    return interpolated_data
Thanks!
Alright, I'll take a swag with this:
def interpolate_to_distance(self, distance):
    dshape = self.dist.shape
    dist = self.dist.T.reshape(-1, dshape[-1])
    data = self.data.T.reshape(-1, dshape[-1])
    intdata = np.array([np.interp(distance, di, da)
                        for di, da in zip(dist, data)])
    return intdata.reshape(dshape[0:2]).T
It at least removes one loop (and those nested indices), but it's not much faster than the original, ~20% faster according to %timeit in IPython. On the other hand, there's a lot of (probably unnecessary, ultimately) transposing and reshaping going on.
For the record, I wrapped it up in a dummy class and filled some 3 x 3 x 3 arrays with random numbers to test:
import numpy as np

class TestClass(object):

    def interpolate_to_distance(self, distance):
        dshape = self.dist.shape
        dist = self.dist.T.reshape(-1, dshape[-1])
        data = self.data.T.reshape(-1, dshape[-1])
        intdata = np.array([np.interp(distance, di, da)
                            for di, da in zip(dist, data)])
        return intdata.reshape(dshape[0:2]).T

    def interpolate_to_distance_old(self, distance):
        interpolated_data = np.ndarray(self.dist.shape[1:])
        for j in range(interpolated_data.shape[1]):
            for i in range(interpolated_data.shape[0]):
                interpolated_data[i, j] = np.interp(
                    distance, self.dist[:, i, j], self.data[:, i, j])
        return interpolated_data

if __name__ == '__main__':
    testobj = TestClass()
    testobj.dist = np.random.randn(3, 3, 3)
    testobj.data = np.random.randn(3, 3, 3)
    distance = 0
    print 'Old:\n', testobj.interpolate_to_distance_old(distance)
    print 'New:\n', testobj.interpolate_to_distance(distance)
Which prints (for my particular set of randoms):
Old:
[[-0.59557042 -0.42706077 0.94629049]
[ 0.55509032 -0.67808257 -0.74214045]
[ 1.03779189 -1.17605275 0.00317679]]
New:
[[-0.59557042 -0.42706077 0.94629049]
[ 0.55509032 -0.67808257 -0.74214045]
[ 1.03779189 -1.17605275 0.00317679]]
I also tried np.vectorize(np.interp), but couldn't get that to work. I suspect it would be much faster if it did.
I couldn't get np.fromfunction to work either, as it passes two 3 x 3 (in this case) arrays of indices to np.interp, the same arrays you get from np.mgrid.
One other note: according to the docs for np.interp,
np.interp does not check that the x-coordinate sequence xp is increasing. If xp is not increasing, the results are nonsense. A simple check for increasingness is:
np.all(np.diff(xp) > 0)
Obviously, my random numbers violate the 'always increasing' rule, but you'll have to be more careful.
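If the dist profiles are not guaranteed to be increasing along the interpolation axis, one option is to sort each profile first. A sketch of my own (it assumes the same 3-D layout as above and a NumPy recent enough to have np.take_along_axis):
import numpy as np

def sort_profiles(dist, data):
    order = np.argsort(dist, axis=0)                      # per-(i, j) ordering
    dist_sorted = np.take_along_axis(dist, order, axis=0)
    data_sorted = np.take_along_axis(data, order, axis=0)
    return dist_sorted, data_sorted

dist = np.random.randn(3, 3, 3)
data = np.random.randn(3, 3, 3)
dist_s, data_s = sort_profiles(dist, data)
print(np.all(np.diff(dist_s, axis=0) > 0))   # True (barring exact ties)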