I'm trying to create a certain style of band(ed) matrix (see Wikipedia). The following code works, but for large M (~300 or so) it becomes quite slow because of the for loop. Is there a way to vectorize it/make better use of NumPy and/or SciPy? I am having trouble figuring out the mathematical operation that this corresponds to, and hence I have not succeeded thus far.
The code I have is as follows
import numpy as np
def banded_matrix(M):
    phis = np.linspace(0, 2*np.pi, M)
    i = 0
    ham = np.zeros((int(2*M), int(2*M)))
    for phi in phis:
        ham_phi = np.array([[1, 1],
                            [1, -1]])*(1+np.cos(phi))
        array_phi = np.zeros(M)
        array_phi[i] = 1
        mat_phi = np.diag(array_phi)
        ham += np.kron(mat_phi, ham_phi)
        i += 1
    return ham
With %timeit banded_matrix(M=300) it takes about 4 seconds on my computer.
Since the code is a bit opaque: what I want to do is construct a large 2M by 2M matrix. In a sense it has M entries on its 'width 2' diagonal, where the entries are 2x2 matrices ham_phi that depend on phi. The matrix will afterwards be diagonalized, so perhaps one could even exploit its structure (it is rather sparse) to speed that up, but I am not sure about that.
If anyone has an idea where to go with this, I'd be happy to follow up on that!
Your matrix is block diagonal, so you can use scipy.linalg.block_diag:
import numpy as np
from scipy.linalg import block_diag
def banded_matrix_scipy(M):
    ham = np.array([[1, 1], [1, -1]])
    phis = np.linspace(0, 2 * np.pi, M)
    ham_phis = ham * (1 + np.cos(phis))[:, None, None]
    return block_diag(*ham_phis)
Let's check that it works and is faster:
b1 = banded_matrix(300)
b2 = banded_matrix_scipy(300)
np.all(b1 == b2) # True
>>> %timeit banded_matrix(300)
1.51 s ± 57 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit banded_matrix_scipy(300)
1.24 ms ± 4.57 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
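On the diagonalization mentioned in the question: because the matrix is block diagonal, its spectrum is the union of the spectra of the 2x2 blocks, so the full 2M x 2M matrix may never need to be diagonalized. A minimal sketch of that idea, assuming only the eigenvalues are needed (the name block_eigenvalues is made up for illustration):

```python
import numpy as np
from scipy.linalg import block_diag

def block_eigenvalues(M):
    """Return the (M, 2, 2) stack of blocks and their eigenvalues, block by block."""
    ham = np.array([[1.0, 1.0], [1.0, -1.0]])
    phis = np.linspace(0, 2 * np.pi, M)
    blocks = ham * (1 + np.cos(phis))[:, None, None]   # shape (M, 2, 2)
    # eigvalsh accepts a stack of symmetric matrices and returns shape (M, 2)
    return blocks, np.linalg.eigvalsh(blocks)

blocks, eigs = block_eigenvalues(50)
# Cross-check against diagonalizing the assembled 100 x 100 matrix
full_eigs = np.linalg.eigvalsh(block_diag(*blocks))
print(np.allclose(np.sort(eigs.ravel()), full_eigs))
```

Each block here is a scalar multiple of [[1, 1], [1, -1]], whose eigenvalues are ±√2, so in this particular case the spectrum could even be written down in closed form.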
The obligatory np.einsum benchmark
def banded_matrix_einsum(M):
    return np.einsum('ij, kl-> ikjl',
                     np.eye(M)*(1 + np.cos(np.linspace(0, 2 * np.pi, M))),
                     np.array([[1, 1], [1, -1]])).reshape(2*M, 2*M)
banded_matrix_einsum(4)
Output
array([[ 2. , 2. , 0. , 0. , 0. , 0. , 0. , 0. ],
[ 2. , -2. , 0. , 0. , 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0.5, 0.5, 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0.5, -0.5, 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0.5, 0.5, 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0.5, -0.5, 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0. , 0. , 2. , 2. ],
[ 0. , 0. , 0. , 0. , 0. , 0. , 2. , -2. ]])
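As for which mathematical operation this corresponds to: the einsum expression above is the Kronecker product of diag(1 + cos φ) with the fixed 2x2 block. A short sketch that spells this out with np.kron and checks it against the einsum version (banded_matrix_kron is a name introduced here for illustration only):

```python
import numpy as np

def banded_matrix_kron(M):
    weights = 1 + np.cos(np.linspace(0, 2 * np.pi, M))
    block = np.array([[1, 1], [1, -1]])
    # kron(diag(w), block) puts w[i] * block in the i-th diagonal 2x2 slot
    return np.kron(np.diag(weights), block)

def banded_matrix_einsum(M):
    return np.einsum('ij,kl->ikjl',
                     np.eye(M) * (1 + np.cos(np.linspace(0, 2 * np.pi, M))),
                     np.array([[1, 1], [1, -1]])).reshape(2 * M, 2 * M)

print(np.allclose(banded_matrix_kron(6), banded_matrix_einsum(6)))
```

This is one np.kron call instead of M of them, which is where the original loop spends its time. It still builds a dense M x M diagonal factor, so it is not expected to beat block_diag.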
Benchmark results
import perfplot
perfplot.show(
setup = lambda M: M,
kernels = [banded_matrix_einsum, banded_matrix_scipy, banded_matrix],
n_range = [50, 100, 150, 200, 250, 300],
logx = False
)
scipy.linalg.block_diag vs np.einsum details
perfplot.show(
setup = lambda M: M,
kernels = [banded_matrix_einsum, banded_matrix_scipy],
n_range = [50, 100, 150, 200, 250, 300, 350, 400],
logx = False
)
Alternatively, you can use the numba JIT compiler to speed this up. Below is a numba equivalent of scipy.linalg.block_diag, based on paime's answer:
import numba as nb
@nb.njit
def block_diag_numba(result, ham_phis):
    for i in range(ham_phis.shape[0]):
        result[i * 2, i * 2:i * 2 + 2] = ham_phis[i, 0]
        result[i * 2 + 1, i * 2:i * 2 + 2] = ham_phis[i, 1]
    return result
def numba_(M):
    ham = np.array([[1, 1], [1, -1]])
    phis = np.linspace(0, 2 * np.pi, M)
    ham_phis = ham * (1 + np.cos(phis))[:, None, None]
    return block_diag_numba(np.zeros((M * ham.shape[1], M * ham.shape[1])), ham_phis)
For M up to about 400, this method is at least 4-5 times faster than the previous ones (timings on the µs scale). It can be adapted to other block shapes and improved further by optimizing the code (without relying on paime's answer), by moving all of the code into the numba function, or by parallelizing. I didn't go further because the performance of paime's answer seemed to satisfy the OP, who accepted it; this is just to show that numba can be used to write a much faster equivalent of scipy.linalg.block_diag.
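If numba is not available, the same "write the blocks straight into a preallocated array" idea can be sketched in plain NumPy by viewing the output as an M x M grid of 2x2 blocks. This is only an illustration (block_diag_view is a hypothetical name), and it has not been benchmarked against the versions above:

```python
import numpy as np
from scipy.linalg import block_diag

def block_diag_view(M):
    ham = np.array([[1.0, 1.0], [1.0, -1.0]])
    phis = np.linspace(0, 2 * np.pi, M)
    ham_phis = ham * (1 + np.cos(phis))[:, None, None]   # shape (M, 2, 2)
    out = np.zeros((2 * M, 2 * M))
    # Reshaping a freshly allocated C-contiguous array returns a view,
    # so writes to `grid` land in `out`.
    grid = out.reshape(M, 2, M, 2)
    idx = np.arange(M)
    # grid[n, a, n, b] is out[2n + a, 2n + b]: fill the diagonal blocks only
    grid[idx, :, idx, :] = ham_phis
    return out

M = 7
blocks = np.array([[1.0, 1.0], [1.0, -1.0]]) * (1 + np.cos(np.linspace(0, 2 * np.pi, M)))[:, None, None]
print(np.array_equal(block_diag_view(M), block_diag(*blocks)))
```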
I have different sized vectors and want to do element-wise manipulations. How can I optimize the following for-loop in Python? (For instance with np.vectorize())
import numpy as np
n = 1000000
vec1 = np.random.rand(n)
vec2 = np.random.rand(3*n)
vec3 = np.random.rand(3*n)
for i in range(len(vec1)):
    if vec1[i] < 0.5:
        vec2[3*i : 3*(i+1)] = vec1[i]*vec3[3*i : 3*(i+1)]
    else:
        vec2[3*i : 3*(i+1)] = [0,0,0]
Thanks a lot for your help.
We could leverage broadcasting -
v = vec3.reshape(-1,3)*vec1[:,None]
m = vec1<0.5
vec2_out = (v*m[:,None]).ravel()
Another way to express that would be -
mask = vec1<0.5
vec2_out = (vec3.reshape(-1,3)*(vec1*mask)[:,None]).ravel()
And use multi-cores with numexpr module -
import numexpr as ne
d = {'V3r':vec3.reshape(-1,3),'vec12D':vec1[:,None]}
out = ne.evaluate('V3r*vec12D*(vec12D<0.5)',d).ravel()
Timings -
In [84]: n = 1000000
...: np.random.seed(0)
...: vec1 = np.random.rand(n)
...: vec2 = np.random.rand(3*n)
...: vec3 = np.random.rand(3*n)
In [86]: %%timeit
...: v = vec3.reshape(-1,3)*vec1[:,None]
...: m = vec1<0.5
...: vec2_out = (v*m[:,None]).ravel()
10 loops, best of 3: 23.2 ms per loop
In [87]: %%timeit
...: mask = vec1<0.5
...: vec2_out = (vec3.reshape(-1,3)*(vec1*mask)[:,None]).ravel()
100 loops, best of 3: 13.1 ms per loop
In [88]: %%timeit
...: d = {'V3r':vec3.reshape(-1,3),'vec12D':vec1[:,None]}
...: out = ne.evaluate('V3r*vec12D*(vec12D<0.5)',d).ravel()
100 loops, best of 3: 4.11 ms per loop
For a generic case, where the else-part could be something other than zeros, it would be -
mask = vec1<0.5
IF_vals = vec3.reshape(-1,3)*vec1[:,None]
ELSE_vals = np.array([1,1,1])
out = np.where(mask[:,None],IF_vals,ELSE_vals).ravel()
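A quick way to gain confidence in the np.where version is to compare it with the original loop on a small random input. The sketch below does that; the function names are introduced here only for the comparison, and the loop writes into a fresh array instead of vec2:

```python
import numpy as np

def loop_version(vec1, vec3, else_vals):
    out = np.empty_like(vec3)
    for i in range(len(vec1)):
        if vec1[i] < 0.5:
            out[3*i:3*(i+1)] = vec1[i] * vec3[3*i:3*(i+1)]
        else:
            out[3*i:3*(i+1)] = else_vals
    return out

def where_version(vec1, vec3, else_vals):
    mask = vec1 < 0.5
    if_vals = vec3.reshape(-1, 3) * vec1[:, None]
    # mask[:, None] broadcasts over the 3 columns of each row
    return np.where(mask[:, None], if_vals, else_vals).ravel()

rng = np.random.default_rng(0)
vec1 = rng.random(1000)
vec3 = rng.random(3000)
else_vals = np.array([1.0, 1.0, 1.0])
print(np.array_equal(loop_version(vec1, vec3, else_vals),
                     where_version(vec1, vec3, else_vals)))
```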
numpy.vectorize, as mentioned in the comments, is for convenience, not performance, per the docs:
The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.
One solution to actually vectorize this would be:
vec2[:] = vec1.repeat(3) * vec3 # Bulk compute all results
vec2[(vec1 >= 0.5).repeat(3)] = 0 # Zero the results you meant to exclude
Another approach (that minimizes temporaries) would be to filter and reshape vec1 so it can be assigned to vec2, then multiply vec2 by vec3 in place to avoid a temporary (beyond the two n length arrays from the first step), e.g.:
vec2.reshape(-1, 3)[:] = (vec1 * (vec1 < 0.5)).reshape(-1, 1)
vec2 *= vec3
An additional temporary could be shaved if vec1 can be modified, simplifying to:
vec1 *= vec1 < 0.5
vec2.reshape(-1, 3)[:] = vec1.reshape(-1, 1)
vec2 *= vec3
The reshape/broadcasting that @Divakar demonstrates is equivalent to rewriting your iteration as:
In [5]: n = 10
...: vec1 = np.random.rand(n)
...: vec2 = np.zeros((n,3))
...: vec3 = np.random.rand(n,3)
...:
...: for i in range(len(vec1)):
...:     if vec1[i] < 0.5:
...:         vec2[i,:] = vec1[i]*vec3[i,:]
...:     else:
...:         vec2[i,:] = 0
...:
In [6]: vec2
Out[6]:
array([[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0.119655 , 0.05079028, 0.00392748],
[0.04529872, 0.04630456, 0.01565116],
[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0.08361475, 0.21825921, 0.1273483 ]])
In [7]: vec1
Out[7]:
array([0.934649 , 0.85309325, 0.50775071, 0.91246865, 0.12970539,
0.13075136, 0.89861756, 0.68921343, 0.80572879, 0.25996369])
By defining vec2 as a (n,3) array, we replace this indexing vec2[3*i : 3*(i+1)] with vec2[i,:] or vec2[i].
Use of a mask to set values to 0 is a good basic numpy idea. But ufuncs also provide a where parameter that can be used as:
In [11]: vec2 = np.zeros((n,3))
In [12]: np.multiply(vec1[:,None],vec3, out=vec2, where=vec1[:,None]<0.5);
In [13]: vec2
Out[13]:
array([[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0.119655 , 0.05079028, 0.00392748],
[0.04529872, 0.04630456, 0.01565116],
[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0. , 0. , 0. ],
[0.08361475, 0.21825921, 0.1273483 ]])
This where needs to be used in conjunction with an out parameter, since it only does the multiply for the True instances; the remaining entries keep whatever values out already held.
I'm not sure how much of a time saver it is.
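To make the role of out concrete, here is a small sketch with hand-picked values (not the data from the session above). The out array is pre-filled with -1 so the untouched entries are easy to spot:

```python
import numpy as np

vec1 = np.array([0.1, 0.9, 0.3, 0.7])
vec3 = np.arange(12.0).reshape(4, 3)
mask = vec1[:, None] < 0.5            # shape (4, 1), broadcasts over columns

out = np.full_like(vec3, -1.0)        # sentinel value instead of zeros
np.multiply(vec1[:, None], vec3, out=out, where=mask)

# Rows 0 and 2 now hold vec1 * vec3; rows 1 and 3 still hold the sentinel.
print(out)
```

Starting from np.zeros, as in the session above, is what turns "left untouched" into "set to 0".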
Assuming I have a matrix / array / list like a=[1,2,3,4,5] and I want to nullify all entries except for the max so it would be a=[0,0,0,0,5].
I'm using b = [val if idx == np.argmax(a) else 0 for idx,val in enumerate(a)] but is there a better (and faster) way (especially for more than 1-dim arrays...)
You can use numpy for an in-place solution. Note that the method below keeps every entry equal to the max value (so ties all survive) and sets everything else to 0.
import numpy as np
a = np.array([1,2,3,4,5])
a[np.where(a != a.max())] = 0
# array([0, 0, 0, 0, 5])
For unique maxima, see @cᴏʟᴅsᴘᴇᴇᴅ's solution.
Rather than masking, you can create an array of zeros and set the right index appropriately.
1-D (optimised) Solution
(Setup) Convert a to a 1D array: a = np.array([1,2,3,4,5]).
To keep just one instance of the max
b = np.zeros_like(a)
i = np.argmax(a)
b[i] = a[i]
To keep all instances of the max
b = np.zeros_like(a)
m = a == a.max()
b[m] = a[m]
N-D solution
np.random.seed(0)
a = np.random.randn(5, 5)
b = np.zeros_like(a)
m = a == a.max(1, keepdims=True)
b[m] = a[m]
b
array([[0. , 0. , 0. , 2.2408932 , 0. ],
[0. , 0.95008842, 0. , 0. , 0. ],
[0. , 1.45427351, 0. , 0. , 0. ],
[0. , 1.49407907, 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 2.26975462]])
Works for all instances of max per row.
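If only the first maximum in each row should survive when there are ties, one possible sketch combines np.argmax with np.take_along_axis and np.put_along_axis. The helper name keep_first_row_max is an assumption made for this example:

```python
import numpy as np

def keep_first_row_max(a):
    b = np.zeros_like(a)
    # Column index of the first maximum in each row, kept 2-D for *_along_axis
    idx = np.argmax(a, axis=1)[:, None]
    np.put_along_axis(b, idx, np.take_along_axis(a, idx, axis=1), axis=1)
    return b

a = np.array([[1.0, 5.0, 5.0],
              [3.0, 2.0, 1.0]])
print(keep_first_row_max(a))
```

In the first row the value 5 appears twice, and only the first occurrence is kept.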
x has shape [batch_size, n_time] where the batches are independent
If k=3, d=discount_rate. Pseudocode:
x[:,i] = x[:,i] + x[:,i+1]*(d**1) + x[:,i+2]*(d**2) + x[:,i+3]*(d**3)
Here's working code, but it's very slow. I'll be executing this function millions of times, so I'm hoping for a faster implementation
import numpy as np
def k_step_discount(x, k, discount_rate):
    n_time = x.shape[1]
    k_include_cur = k + 1 # k excludes current timestep
    for i in range(n_time):
        k_cur = min(n_time - i, k_include_cur) # prevent out of bounds
        for j in range(1, k_cur):
            x[:, i] += x[:, i+j] * (discount_rate ** j)
    return x
x = np.array([
[0,0,0,1,0,0],
[0,1,2,3,4,5.]
])
y = k_step_discount(x+0, k=2, discount_rate=.9)
print('x\n{}\ny\n{}'.format(x, y))
>> x
[[ 0. 0. 0. 1. 0. 0.]
[ 0. 1. 2. 3. 4. 5.]]
>> y
[[ 0. 0.81 0.9 1. 0. 0. ]
[ 2.52 5.23 7.94 10.65 8.5 5. ]]
A scipy function that's similar is:
import scipy.signal
import numpy as np
x = np.array([[0,0,0,1,0,0.]])
discount_rate = .9
y = np.flip(scipy.signal.lfilter([1], [1, -discount_rate], np.flip(x+0, 1), axis=1), 1)
print('x\n{}\ny\n{}'.format(x, y))
>> x
[[ 0. 0. 0. 1. 0. 0.]]
>> y
[[ 0.729 0.81 0.9 1. 0. 0. ]]
However, it discounts until the end of n_time rather than only for k steps
I'm also interested in K-step discounting without batches, if that'd be easier/faster
import numpy as np
def k_step_discount_no_batch(x, k, discount_rate):
    n_time = x.shape[0]
    k_include_cur = k + 1 # k excludes current timestep
    for i in range(n_time):
        k_cur = min(n_time - i, k_include_cur) # prevent out of bounds
        for j in range(1, k_cur):
            x[i] += x[i+j] * (discount_rate ** j)
    return x
x = np.array([8,0,0,0,1,2.])
y = k_step_discount_no_batch(x+0, k=2, discount_rate=.9)
print('x\n{}\ny\n{}'.format(x, y))
>> x
[ 8. 0. 0. 0. 1. 2.]
>> y
[ 8. 0. 0.81 2.52 2.8 2. ]
Similar no_batch scipy function
import scipy.signal
import numpy as np
x = np.array([8,0,0,0,1,2.])
discount_rate = .9
y = scipy.signal.lfilter([1], [1, -discount_rate], x[::-1], axis=0)[::-1]
print('x\n{}\ny\n{}'.format(x, y))
>> x
[ 8. 0. 0. 0. 1. 2.]
>> y
[ 9.83708 2.0412 2.268 2.52 2.8 2. ]
You could use 2D convolution here. To get the scaling right, we need to create the proper 2D kernel, which is a flipped version of the powers of discount_rate. This follows from the definition of convolution: the kernel is slid across the input data in flipped order, and at each position the overlapping elements are multiplied by the kernel values and summed up, which is precisely what is needed in this case.
Thus, the implementation would be simply -
from scipy.signal import convolve2d as conv2d
import numpy as np
def k_step_discount_conv2d(x, k, discount_rate, is_batch=True):
    if is_batch:
        kernel = discount_rate**np.arange(k+1)[::-1][None]
        return conv2d(x,kernel)[:,k:]
    else:
        kernel = discount_rate**np.arange(k+1)[::-1]
        return np.convolve(x, kernel)[k:]
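To see why the flipped kernel and the [k:] slice give the right weights, here is a small 1-D check using the no-batch input from the question. It compares np.convolve against the sum written out directly (variable names are chosen for this sketch only):

```python
import numpy as np

x = np.array([8, 0, 0, 0, 1, 2.])
k, d = 2, 0.9

kernel = d ** np.arange(k + 1)[::-1]   # [d**2, d, 1]
full = np.convolve(x, kernel)          # 'full' mode, length len(x) + k
y = full[k:]                           # y[i] = x[i] + d*x[i+1] + d**2*x[i+2]

# The same sum written out term by term, treating out-of-range x as 0
direct = np.array([
    sum(x[i + j] * d**j for j in range(k + 1) if i + j < len(x))
    for i in range(len(x))
])
print(np.allclose(y, direct))
```

The result matches the y printed by k_step_discount_no_batch in the question, [8, 0, 0.81, 2.52, 2.8, 2].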
Sample run -
In [190]: x
Out[190]:
array([[ 0., 0., 0., 1., 0., 0.],
[ 0., 1., 2., 3., 4., 5.]])
# Proposed method
In [191]: k_step_discount_conv2d(x, k=2, discount_rate=0.9)
Out[191]:
array([[ 0. , 0.81, 0.9 , 1. , 0. , 0. ],
[ 2.52, 5.23, 7.94, 10.65, 8.5 , 5. ]])
# Original loopy method
In [192]: k_step_discount(x, k=2, discount_rate=.9)
Out[192]:
array([[ 0. , 0.81, 0.9 , 1. , 0. , 0. ],
[ 2.52, 5.23, 7.94, 10.65, 8.5 , 5. ]])
Runtime test
In [206]: x = np.random.randint(0,9,(100,1000)).astype(float)
In [207]: %timeit k_step_discount_conv2d(x, k=2, discount_rate=0.9)
1000 loops, best of 3: 1.27 ms per loop
In [208]: %timeit k_step_discount(x, k=2, discount_rate=.9)
100 loops, best of 3: 4.83 ms per loop
With bigger k's :
In [215]: x = np.random.randint(0,9,(100,1000)).astype(float)
In [216]: %timeit k_step_discount_conv2d(x, k=20, discount_rate=0.9)
100 loops, best of 3: 5.44 ms per loop
In [217]: %timeit k_step_discount(x, k=20, discount_rate=.9)
10 loops, best of 3: 44.8 ms per loop
Thus, expect huge speedups with bigger k's!
Further boost
As suggested by @Eric, we could also leverage scipy.ndimage's 1D convolution (convolve1d) here.
For a proper comparison listing both with Scipy's 2D and 1D convolution methods -
from scipy.ndimage import convolve1d as conv1d
def using_conv2d(x, k, discount_rate):
    kernel = discount_rate**np.arange(k+1)[::-1][None]
    return conv2d(x,kernel)[:,k:]
def using_conv1d(x, k, discount_rate):
    kernel = discount_rate**np.arange(k+1)[::-1]
    return conv1d(x,kernel, mode='constant', origin=k//2)
Timings -
In [100]: x = np.random.randint(0,9,(100,1000)).astype(float)
In [101]: out1 = using_conv2d(x, k=20, discount_rate=0.9)
...: out2 = using_conv1d(x, k=20, discount_rate=0.9)
...:
In [102]: np.allclose(out1, out2)
Out[102]: True
In [103]: %timeit using_conv2d(x, k=20, discount_rate=0.9)
100 loops, best of 3: 5.27 ms per loop
In [104]: %timeit using_conv1d(x, k=20, discount_rate=0.9)
1000 loops, best of 3: 1.43 ms per loop
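The scipy.signal.lfilter call from the question can also be turned into a k-step version: instead of the recursive denominator [1, -discount_rate], which discounts until the end of n_time, use a finite numerator holding the k+1 powers of the discount rate. The following is a sketch of that variant (k_step_discount_fir is a hypothetical name, and no timings are claimed for it):

```python
import numpy as np
from scipy.signal import lfilter

def k_step_discount_fir(x, k, discount_rate):
    # FIR coefficients 1, d, d**2, ..., d**k; denominator [1] means no recursion
    b = discount_rate ** np.arange(k + 1)
    # lfilter looks backwards in time, so flip, filter, and flip back
    flipped = np.flip(x, axis=-1)
    return np.flip(lfilter(b, [1.0], flipped, axis=-1), axis=-1)

x = np.array([
    [0, 0, 0, 1, 0, 0],
    [0, 1, 2, 3, 4, 5.]
])
y = k_step_discount_fir(x, k=2, discount_rate=0.9)
print(y)
```

Because it filters along the last axis, the same function handles both the batched and the no-batch input.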
I am trying to build the following matrix in Python without using a for loop:
A
[[ 0.1 0.2 0. 0. 0. ]
[ 1. 2. 3. 0. 0. ]
[ 0. 1. 2. 3. 0. ]
[ 0. 0. 1. 2. 3. ]
[ 0. 0. 0. 4. 5. ]]
I tried the fill_diagonal method in NumPy (see matrix B below) but it does not give me the same matrix as shown in matrix A:
B
[[ 1. 0.2 0. 0. 0. ]
[ 0. 2. 0. 0. 0. ]
[ 0. 0. 3. 0. 0. ]
[ 0. 0. 0. 1. 0. ]
[ 0. 0. 0. 4. 5. ]]
Here is the Python code that I used to construct the matrices:
import numpy as np
import scipy.linalg as sp # maybe use scipy to build diagonal matrix?
#---- build diagonal square array using "for" loop
m = 5
A = np.zeros((m, m))
A[0, 0] = 0.1
A[0, 1] = 0.2
for i in range(1, m-1):
    A[i, i-1] = 1 # m-1
    A[i, i] = 2 # m
    A[i, i+1] = 3 # m+1
A[m-1, m-2] = 4
A[m-1, m-1] = 5
print('A \n', A)
#---- build diagonal square array without loop
B = np.zeros((m, m))
B[0, 0] = 0.1
B[0, 1] = 0.2
np.fill_diagonal(B, [1, 2, 3])
B[m-1, m-2] = 4
B[m-1, m-1] = 5
print('B \n', B)
So is there a way to construct a diagonal matrix like the one shown by matrix A without using a for loop?
There are functions for this in scipy.sparse, e.g.:
from scipy.sparse import diags
C = diags([1,2,3], [-1,0,1], shape=(5,5), dtype=float)
C = C.toarray()
C[0, 0] = 0.1
C[0, 1] = 0.2
C[-1, -2] = 4
C[-1, -1] = 5
Banded matrices like this one are generally very sparse, so you could also keep it as a sparse matrix. Depending on the application, this can bring large efficiency benefits.
The efficiency gains from sparse matrices depend very much on matrix size. For a 5x5 array it is hardly worth the bother. But for larger matrices, creating the array can be a lot faster with sparse matrices, as illustrated by the following example with an identity matrix:
from scipy import sparse
%timeit np.eye(3000)
# 100 loops, best of 3: 3.12 ms per loop
%timeit sparse.eye(3000)
# 10000 loops, best of 3: 79.5 µs per loop
But the real strength of the sparse matrix data type is shown when you need to do mathematical operations on arrays that are sparse:
%timeit np.eye(3000).dot(np.eye(3000))
# 1 loops, best of 3: 2.8 s per loop
%timeit sparse.eye(3000).dot(sparse.eye(3000))
# 1000 loops, best of 3: 1.11 ms per loop
Or when you need to work with some very large but sparse array:
np.eye(1E6)
# ValueError: array is too big.
sparse.eye(1E6)
# <1000000x1000000 sparse matrix of type '<type 'numpy.float64'>'
# with 1000000 stored elements (1 diagonals) in DIAgonal format>
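Going back to the matrix from the question: scipy.sparse.diags also accepts one full array per diagonal, so the special first and last rows can be included up front and the matrix can stay sparse until a dense copy is really needed. A minimal sketch, with the per-diagonal arrays spelled out for m = 5 (the variable names are chosen here for illustration):

```python
import numpy as np
from scipy.sparse import diags

m = 5
lower = np.r_[np.ones(m - 2), 4.0]            # A[1,0], A[2,1], ..., A[m-1,m-2]
main = np.r_[0.1, np.full(m - 2, 2.0), 5.0]   # A[0,0], ..., A[m-1,m-1]
upper = np.r_[0.2, np.full(m - 2, 3.0)]       # A[0,1], ..., A[m-2,m-1]

A_sparse = diags([lower, main, upper], offsets=[-1, 0, 1], format='csr')
A = A_sparse.toarray()                        # dense copy, only for display
print(A)
```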
Notice that in the flattened (row-major) matrix, the number of zeros between consecutive groups of non-zero entries is always 3 (in general it is a constant for any matrix with this kind of banded structure):
In [10]:
import numpy as np
A1=[0.1, 0.2]
A2=[1,2,3]
A3=[4,5]
SPC=[0,0,0] # spacing zeros (or use np.zeros(3))
np.hstack((A1,SPC,A2,SPC,A2,SPC,A2,SPC,A3)).reshape(5,5)
Out[10]:
array([[ 0.1, 0.2, 0. , 0. , 0. ],
[ 1. , 2. , 3. , 0. , 0. ],
[ 0. , 1. , 2. , 3. , 0. ],
[ 0. , 0. , 1. , 2. , 3. ],
[ 0. , 0. , 0. , 4. , 5. ]])
In [11]:
import itertools #A more general way of doing it
np.hstack(list(itertools.chain(*[(item, SPC) for item in [A1, A2, A2, A2, A3]]))[:-1]).reshape(5,5)
Out[11]:
array([[ 0.1, 0.2, 0. , 0. , 0. ],
[ 1. , 2. , 3. , 0. , 0. ],
[ 0. , 1. , 2. , 3. , 0. ],
[ 0. , 0. , 1. , 2. , 3. ],
[ 0. , 0. , 0. , 4. , 5. ]])
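The spacing is 3 here because m = 5: for an m x m tridiagonal matrix each group of band entries starts m + 1 positions after the previous one in the flattened array, which leaves m - 2 zeros between groups. A sketch of the same trick for general m (assuming m >= 3; tridiag_by_flattening is a made-up helper name):

```python
import numpy as np

def tridiag_by_flattening(first, middle, last, m):
    spacer = np.zeros(m - 2)                   # m - 2 zeros between groups
    rows = [first] + [middle] * (m - 2) + [last]
    pieces = []
    for row in rows:
        pieces.extend([row, spacer])
    # Drop the trailing spacer, then fold the flat array back into m x m
    return np.hstack(pieces[:-1]).reshape(m, m)

print(tridiag_by_flattening([0.1, 0.2], [1, 2, 3], [4, 5], 5))
```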
I have a large real 1-d data set called r. I would like to plot:
mean(log(1+a*r)) vs a, with a > -1 .
This is my code:
rr=pd.read_csv('goog.csv')
dd=rr['Close']
series=pd.Series(dd)
seriespct=series.pct_change()
seriespct[0]=seriespct.mean()
dum1 =[0]*len(dd)
a=1.
a_max = 1.
a_step = 0.01
a = scipy.arange(-3.+a_step, a_max, a_step)
n = len(a)
dum2 =[0]*n
m=len(dd)
for j in range(n):
    for i in range(m):
        dum1[i]=math.log(1+a[j]*seriespct[i])
    dum2[j]=scipy.mean(dum1)
plt.plot(a,dum2)
plt.show()
How can I do this in a more elegant way?
I would recommend this:
plt.plot(a, np.log(1 + r*a[:,None]).mean(1))
This has a big speed advantage because it avoids Python-level for loops; the looping is done inside NumPy, which is significantly faster when your dataset is large.
In [49]: a = np.arange(a_step-.3, a_max, a_step)
In [50]: r = np.random.random(100)
In [51]: timeit [scipy.mean(log(1+a[i]*r)) for i in range(len(a))]
100 loops, best of 3: 5.47 ms per loop
In [52]: timeit np.log(1 + r*a[:,None]).mean(1)
1000 loops, best of 3: 384 µs per loop
It works by broadcasting so that a varies along one axis and r along another, then you can take the mean just along the axis that r varies along, so you still have an array that varies with a (and has the same shape as a):
import numpy as np
import matplotlib.pyplot as plt
r = np.random.random(100)
a = 1.
a_max = 1.
a_step = 0.01
a = np.arange(a_step-.3, a_max, a_step)
a.shape
#(129,)
a = a[:,None] #adds a new axis, making this a column vector, same as: a = a.reshape(-1,1)
a.shape
#(129, 1)
(a*r).shape
#(129, 100)
loga = np.log(1 + a*r)
loga.shape
#(129,100)
mloga = loga.mean(axis=1) #take the mean along the 2nd axis, where `r` varies
mloga.shape
#(129,)
plt.plot(a, mloga)
plt.show()
ADDENDUM:
To avoid dependency on broadcasting, you can use np.outer:
plt.plot(a, np.log(1 + np.outer(a,r)).mean(1))
Which has no need for reshaping a (skip the step a = a[:,None])
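As a sanity check, the broadcasting form, the np.outer form, and an explicit loop over a can be compared on random data. The sketch below uses synthetic r values in [0, 1) as a stand-in for the real data set, so that 1 + a*r stays positive:

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.random(100)                       # stand-in for the real data
a_step, a_max = 0.01, 1.0
a = np.arange(a_step - 0.3, a_max, a_step)

loop = np.array([np.mean(np.log(1 + ai * r)) for ai in a])
bcast = np.log(1 + r * a[:, None]).mean(axis=1)    # a along rows, r along columns
outer = np.log(1 + np.outer(a, r)).mean(axis=1)

print(np.allclose(loop, bcast), np.allclose(loop, outer))
```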
Here's a simpler example, so you can see what's happening:
r = np.exp(np.arange(1,5))
a = np.arange(5)
In [33]: r
Out[33]: array([ 2.71828183, 7.3890561 , 20.08553692, 54.59815003])
In [34]: a
Out[34]: array([0, 1, 2, 3, 4])
In [39]: r*a[:,None]
Out[39]:
# this is 2.7... 7.3... 20.08... 54.5... # times:
array([[ 0. , 0. , 0. , 0. ], # 0
[ 2.71828183, 7.3890561 , 20.08553692, 54.59815003], # 1
[ 5.43656366, 14.7781122 , 40.17107385, 109.19630007], # 2
[ 8.15484549, 22.1671683 , 60.25661077, 163.7944501 ], # 3
[ 10.87312731, 29.5562244 , 80.34214769, 218.39260013]]) # 4
In [40]: np.outer(a,r)
Out[40]:
array([[ 0. , 0. , 0. , 0. ],
[ 2.71828183, 7.3890561 , 20.08553692, 54.59815003],
[ 5.43656366, 14.7781122 , 40.17107385, 109.19630007],
[ 8.15484549, 22.1671683 , 60.25661077, 163.7944501 ],
[ 10.87312731, 29.5562244 , 80.34214769, 218.39260013]])
# this is the mean of each row (one value per element of a):
In [41]: (np.outer(a,r)).mean(1)
Out[41]: array([ 0. , 21.19775622, 42.39551244, 63.59326866, 84.79102488])
# and the log of 1 + the above is:
In [42]: np.log(1+(np.outer(a,r)).mean(1))
Out[42]: array([ 0. , 3.09999121, 3.77035604, 4.16811021, 4.4519144 ])
You can use NumPy to do the means (older SciPy exposed the same functions as scipy.mean, scipy.array, and so on).
You can use matplotlib to do plotting.
import numpy as np
from matplotlib import pyplot
# convert r from a Python list to a 1-D array
r = np.array(r)
# edit these
a_max = 100
a_step = 0.1
a = np.arange(-1+a_step, a_max, a_step)
n = len(a)
pyplot.plot(a, [np.mean(np.log(1+a[i]*r)) for i in range(n)], 'b-')
pyplot.show()