I am very new to Python (in the past I used Mathematica, Maple, or Matlab scripts). I am very impressed by how NumPy can evaluate functions over whole arrays, but I am having trouble doing the same in several dimensions. My question is very simple (please don't laugh): is there a more elegant and efficient way to evaluate some function f (which is defined over R^2) without using loops?
import numpy

def evaluate(f):
    M = numpy.zeros((10, 10))
    for i in range(0, 10):
        for j in range(0, 10):
            M[i, j] = f(i, j)
    return M
The goal when coding with numpy is to express your computation on the whole array, as much as possible. So if your function is, for example, f(x,y) = x**2 + 2*y and you want to apply it to all integer pairs (x, y) in [0, 10) x [0, 10), do:
x,y = np.mgrid[0:10, 0:10]
fxy = x**2 + 2*y
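A memory-saving variant of the same idea: np.ogrid returns open (broadcastable) grids, so the full 10x10 coordinate arrays never need to be materialized. A minimal sketch:
x, y = np.ogrid[0:10, 0:10]  # shapes (10, 1) and (1, 10)
fxy = x**2 + 2*y             # broadcasts to shape (10, 10)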
If you can't express your function in such a whole-array form, then either:
ask how to do it (and state the function definition explicitly), or
use numpy.vectorize
Same example using vectorize:
def f(x,y): return x**2 + 2*y
x,y = np.mgrid[0:10, 0:10]
fxy = np.vectorize(f)(x.ravel(), y.ravel()).reshape(x.shape)
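As an aside, np.vectorize handles N-dimensional inputs directly, so the ravel/reshape round trip is optional:
fxy = np.vectorize(f)(x, y)  # same (10, 10) result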
Note that in practice I only use vectorize, much like Python's map, when the contents of the arrays are not numbers. A typical example is computing the length of every list in an array of lists:
# construct a sample array of lists (dtype=object keeps the ragged lists intact)
list_of_lists = np.array([list(range(i)) for i in range(1000)], dtype=object)
print(np.vectorize(len)(list_of_lists))
# [0 1 ... 998 999]
Yes, many numpy functions operate on N-dimensional arrays. Take this example:
>>> M = numpy.zeros((3,3))
>>> M[0][0] = 1
>>> M[2][2] = 1
>>> M
array([[ 1.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  1.]])
>>> M > 0.5
array([[ True, False, False],
       [False, False, False],
       [False, False,  True]], dtype=bool)
>>> numpy.sum(M)
2.0
Note the difference between numpy.sum, which operates on N-dimensional arrays, and sum, which only goes 1 level deep:
>>> sum(M)
array([ 1., 0., 1.])
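For comparison, numpy.sum with an explicit axis argument reproduces that one-level-deep behaviour:
>>> numpy.sum(M, axis=0)
array([ 1.,  0.,  1.])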
So if you build your function f() out of operations that work on n-dimensional arrays, then f() itself will work on n-dimensional arrays.
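Even piecewise definitions fit this pattern, since np.where is the array analogue of if/else. A small sketch (the function here is made up for illustration):
import numpy as np

def f(x, y):
    # hypothetical piecewise rule: x*y where x > y, otherwise x + y
    return np.where(x > y, x * y, x + y)

x, y = np.mgrid[0:10, 0:10]
M = f(x, y)  # evaluated element-wise over the whole grid, no loops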
You can also use numpy's multi-dimensional slicing, like below; you just provide a slice for each dimension:
import numpy as np

arr = np.zeros((5, 5))  # 5 rows, 5 columns
# update only the first column
arr[:, 0] = 1
# update only the last row ... same as arr[-1] = 1
arr[-1, :] = 1
# update the center block
arr[1:-1, 1:-1] = 1
print(arr)
output:
[[1. 0. 0. 0. 0.]
 [1. 1. 1. 1. 0.]
 [1. 1. 1. 1. 0.]
 [1. 1. 1. 1. 0.]
 [1. 1. 1. 1. 1.]]
A pure-Python answer, not depending on numpy tools, is to iterate over the Cartesian product of two sequences:
from itertools import product

for i, j in product(range(0, 10), range(0, 10)):
    M[i, j] = f(i, j)
Edit: Actually, I should have read the question properly. This still uses loops, just one less loop.
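A loop-free route in the same spirit is np.fromfunction, which calls f once with whole index arrays; this assumes f is built from array operations (the f below is made up for illustration):
import numpy as np

def f(i, j):
    return i**2 + 2*j  # hypothetical function written with array operations

M = np.fromfunction(f, (10, 10))  # f receives index arrays, no Python loops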
Related
I have an initial value problem that needs to be solved; the differential equations are derived from a dictionary that looks like:
eqs = {'a': array([-1.,   2.,  4.,  0., ...]),
       'b': array([ 1., -10.,  0.,  0., ...]),
       'c': array([ 0.,   3., -4.,  0., ...]),
       'd': array([ 0.,   5.,  0., -0., ...]),
       ...}
The differential equation da/dt is given as -1*[a]+2*[b]+4*[c]+0*[d]....
Using the dictionary above, I write a function dXdt as:
def dXdt(X, t):
    sys_a, sys_b, sys_c, sys_d, ... = eqs['a'], eqs['b'], eqs['c'], eqs['d'], ...
    dadt = sys_a[0]*X[0] + sys_a[1]*X[1] + sys_a[2]*X[2] + sys_a[3]*X[3] + ...
    dbdt = sys_b[0]*X[0] + sys_b[1]*X[1] + sys_b[2]*X[2] + sys_b[3]*X[3] + ...
    dcdt = sys_c[0]*X[0] + sys_c[1]*X[1] + sys_c[2]*X[2] + sys_c[3]*X[3] + ...
    dddt = sys_d[0]*X[0] + sys_d[1]*X[1] + sys_d[2]*X[2] + sys_d[3]*X[3] + ...
    ...
    return [dadt, dbdt, dcdt, dddt, ...]
The initial conditions are:
X0 = [1, 0, 0, 0, ...]
and the solution is given as:
X = integrate.odeint(dXdt, X0, np.linspace(0,10,11))
This works well for a small system, where I can write the equations by hand. However, I have a system that has ~150 differential equations, and I need to automate the way I write dXdt to be used with scipy.integrate.odeint, given the dictionary of eqs. Is there a way to do so?
Any time something follows a simple linear pattern, you can use an iteration or a comprehension to express it. If you have multiple such patterns, you can just nest them. So this:
sys_a, sys_b, sys_c, sys_d, ... = eqs['a'], eqs['b'], eqs['c'], eqs['d'], ...
dadt = sys_a[0]*X[0] + sys_a[1]*X[1] + sys_a[2]*X[2] + sys_a[3]*X[3] + ...
dbdt = sys_b[0]*X[0] + sys_b[1]*X[1] + sys_b[2]*X[2] + sys_b[3]*X[3] + ...
dcdt = sys_c[0]*X[0] + sys_c[1]*X[1] + sys_c[2]*X[2] + sys_c[3]*X[3] + ...
dddt = sys_d[0]*X[0] + sys_d[1]*X[1] + sys_d[2]*X[2] + sys_d[3]*X[3] + ...
...
[dadt, dbdt, dcdt, dddt, ...]
can be expressed simply as:
[sum(eqs[char][i] * X[i] for i in range(len(X))) for char in eqs.keys()]
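If the iteration order of eqs matches the ordering of the state vector X (worth checking explicitly; plain dicts preserve insertion order in Python 3.7+), the whole right-hand side also collapses to one matrix-vector product, which scales well to ~150 equations. A sketch, with a hypothetical 4-equation dictionary standing in for the real one:
import numpy as np
from scipy import integrate

# hypothetical 4-equation system standing in for the ~150-equation dict
eqs = {'a': np.array([-1.,   2.,  4.,  0.]),
       'b': np.array([ 1., -10.,  0.,  0.]),
       'c': np.array([ 0.,   3., -4.,  0.]),
       'd': np.array([ 0.,   5.,  0., -0.])}

# build the coefficient matrix once, outside the integrator's inner loop
A = np.vstack([eqs[k] for k in eqs])

def dXdt(X, t):
    # the entire right-hand side is a single matrix-vector product
    return A.dot(X)

X0 = np.array([1., 0., 0., 0.])
X = integrate.odeint(dXdt, X0, np.linspace(0, 10, 11))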
I am trying to use np.bmat in my numba-optimized python program. To do so, I have to manually define a jitted function bmat since the native one from numpy is not supported:
import numpy as np
from numba import njit

@njit
def _bmat_2d(matrices):
    arr_rows = []
    for row in matrices:
        arr_rows.append(np.concatenate(row, axis=-1))
    return np.array(np.concatenate(arr_rows, axis=0))
(this code is more or less a simplified copy of the one from numpy)
However:
numba only accepts tuples as input to np.concatenate [1]
numba is very bad at casting arbitrary lists to tuples [2]
Do you have any ideas for working around this?
Refs:
[1] https://github.com/numba/numba/issues/2787
[2] https://github.com/numba/numba/issues/2771
Would the following work for your purposes?
import numpy as np
import numba as nb

@nb.njit
def _bmat_2d(m):
    out = np.hstack(m[0])
    for row in m[1:]:
        x = np.hstack(row)
        out = np.vstack((out, x))
    return out
A = np.random.randint(10, size=(3,2))
B = np.random.randint(10, size=(3,1))
C = np.random.randint(10, size=(3,3))
D = np.random.randint(10, size=(4,6))
a = np.bmat(((A, B, C), (D,)))
b = _bmat_2d(((A, B, C), (D,)))
print(np.allclose(a, b))  # True
Note that you have to pass in a tuple-of-tuples rather than a list-of-lists; otherwise you will get a "reflected list" error, since Numba in its current version cannot handle lists of lists.
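If your blocks naturally arrive as a list-of-lists, one workaround (a sketch, reusing the names above) is to convert to tuples in ordinary Python before crossing into the jitted function:
# plain-Python conversion; only the _bmat_2d call itself is jitted
blocks = [[A, B, C], [D]]
b = _bmat_2d(tuple(tuple(row) for row in blocks))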
Since np.hstack was incompatible with numba for me, I had to write my own solution. Maybe some of you will find this useful. It's not pretty, but it does the job.
This essentially does the same thing as J = np.bmat([[J_1, J_2], [J_3, J_4]]).
Just be sure to change J = np.zeros((8, len(J_1[0])*2)) to fit the output array you want:
import numpy as np
import numba

@numba.njit
def main():
    J_1 = np.array([[-64., 25.6, 25.6, 12.8], [25.6, -25.6, 0., 0.], [25.6, 0., -25.6, 0.], [12.8, 0., 0., -652.8]])
    J_2 = np.array([[-85.33333333, 34.13333333, 34.13333333, 17.06666667], [34.13333333, -34.13333333, 0., 0.], [34.13333333, 0., -34.13333333, 0.], [17.06666667, 0., 0., -870.4]])
    J_3 = np.array([[85.33333333, -34.13333333, -34.13333333, -17.06666667], [-34.13333333, 34.13333333, -0., -0.], [-34.13333333, -0., 34.13333333, -0.], [-17.06666667, -0., -0., 870.4]])
    J_4 = np.array([[-64., 25.6, 25.6, 12.8], [25.6, -25.6, 0., 0.], [25.6, 0., -25.6, 0.], [12.8, 0., 0., -652.8]])
    J = np.zeros((8, len(J_1[0])*2))
    for idx, _ in enumerate(J_1[0]):
        # left half of J: column idx of J_1 (rows 0-3) and J_3 (rows 4-7)
        J[0][idx], J[1][idx], J[2][idx], J[3][idx], J[4][idx], J[5][idx], J[6][idx], J[7][idx] = J_1[0][idx], J_1[1][idx], J_1[2][idx], J_1[3][idx], J_3[0][idx], J_3[1][idx], J_3[2][idx], J_3[3][idx]
        # right half of J: column idx of J_2 (rows 0-3) and J_4 (rows 4-7)
        J[0][idx+len(J_1[0])], J[1][idx+len(J_1[0])], J[2][idx+len(J_1[0])], J[3][idx+len(J_1[0])], J[4][idx+len(J_1[0])], J[5][idx+len(J_1[0])], J[6][idx+len(J_1[0])], J[7][idx+len(J_1[0])] = J_2[0][idx], J_2[1][idx], J_2[2][idx], J_2[3][idx], J_4[0][idx], J_4[1][idx], J_4[2][idx], J_4[3][idx]
    print(J)

if __name__ == '__main__':
    main()
Edit:
A guy helped me on another thread with this simple replacement for np.bmat which works inside numba.njit:
J = np.vstack((np.hstack((J_1, J_2)), np.hstack((J_3, J_4))))
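For reference, a minimal sketch of that replacement wrapped as its own jitted helper (the names are illustrative):
import numpy as np
from numba import njit

@njit
def bmat_2x2(J_1, J_2, J_3, J_4):
    # equivalent of np.bmat([[J_1, J_2], [J_3, J_4]]) for a 2x2 block layout;
    # hstack/vstack are called with tuples, which numba supports
    return np.vstack((np.hstack((J_1, J_2)), np.hstack((J_3, J_4))))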
I have an array:
MDP = [[0.705, 0.655, 0.614, 0.388], [0.762, None, 0.660, -1], [0.812, 0.868, 0.918, +1]]
How can I apply np.around on above array without getting the error for None and -1, +1 values?
TIA
Make sure that you work with a numpy array, not lists of lists:
np.around(np.array(MDP).astype(float))
# array([[ 1.,  1.,  1.,  0.],
#        [ 1., nan,  1., -1.],
#        [ 1.,  1.,  1.,  1.]])
You can convert the result back to a nested list with .tolist(), if needed.
My solution is to special-case values that are None. This can be done pretty elegantly via a lambda function.
If your array is 1D:
flex_round = lambda array: [None if x is None else np.round(x) for x in array]
If your array is 2D:
flex_round = lambda array: [[None if x is None else np.round(x) for x in y] for y in array]
Do not forget to pass the decimals argument to the np.round call to specify how many digits should remain after the decimal point.
I bet I am doing something very simple wrong. I want to start with an empty 2D numpy array and append arrays to it (with dimensions 1 row by 4 columns).
open_cost_mat_train = np.matrix([])
for i in range(10):
    open_cost_mat = np.array([i, 0, 0, 0])
    open_cost_mat_train = np.vstack([open_cost_mat_train, open_cost_mat])
my error trace is:
File "/Users/me/anaconda/lib/python2.7/site-packages/numpy/core/shape_base.py", line 230, in vstack
return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
What am I doing wrong? I have tried append, concatenate, defining the empty 2D array as [[]], as [], array([]) and many others.
You need to reshape your original matrix so that the number of columns match the appended arrays:
open_cost_mat_train = np.matrix([]).reshape((0,4))
After which, it gives:
open_cost_mat_train
# matrix([[ 0.,  0.,  0.,  0.],
#         [ 1.,  0.,  0.,  0.],
#         [ 2.,  0.,  0.,  0.],
#         [ 3.,  0.,  0.,  0.],
#         [ 4.,  0.,  0.,  0.],
#         [ 5.,  0.,  0.,  0.],
#         [ 6.,  0.,  0.,  0.],
#         [ 7.,  0.,  0.,  0.],
#         [ 8.,  0.,  0.,  0.],
#         [ 9.,  0.,  0.,  0.]])
If open_cost_mat_train is large I would encourage you to replace the for loop by a vectorized algorithm. I will use the following functions to show how efficiency is improved by vectorizing loops:
def fvstack():
    import numpy as np
    np.random.seed(100)
    ocmt = np.matrix([]).reshape((0, 4))
    for i in range(10):
        x = np.random.random()
        ocm = np.array([x, x + 1, 10*x, x/10])
        ocmt = np.vstack([ocmt, ocm])
    return ocmt

def fshape():
    import numpy as np
    from numpy.matlib import empty
    np.random.seed(100)
    ocmt = empty((10, 4))
    for i in range(ocmt.shape[0]):
        ocmt[i, 0] = np.random.random()
    ocmt[:, 1] = ocmt[:, 0] + 1
    ocmt[:, 2] = 10*ocmt[:, 0]
    ocmt[:, 3] = ocmt[:, 0]/10
    return ocmt
I've assumed that the values that populate the first column of ocmt (shorthand for open_cost_mat_train) are obtained from a for loop, and the remaining columns are a function of the first column, as stated in your comments to my original answer. As real cost data are not available, in the forthcoming example the values in the first column are random numbers, and the second, third and fourth columns are the functions x + 1, 10*x and x/10, respectively, where x is the corresponding value in the first column.
In [594]: fvstack()
Out[594]:
matrix([[  5.43404942e-01,   1.54340494e+00,   5.43404942e+00,   5.43404942e-02],
        [  2.78369385e-01,   1.27836939e+00,   2.78369385e+00,   2.78369385e-02],
        [  4.24517591e-01,   1.42451759e+00,   4.24517591e+00,   4.24517591e-02],
        [  8.44776132e-01,   1.84477613e+00,   8.44776132e+00,   8.44776132e-02],
        [  4.71885619e-03,   1.00471886e+00,   4.71885619e-02,   4.71885619e-04],
        [  1.21569121e-01,   1.12156912e+00,   1.21569121e+00,   1.21569121e-02],
        [  6.70749085e-01,   1.67074908e+00,   6.70749085e+00,   6.70749085e-02],
        [  8.25852755e-01,   1.82585276e+00,   8.25852755e+00,   8.25852755e-02],
        [  1.36706590e-01,   1.13670659e+00,   1.36706590e+00,   1.36706590e-02],
        [  5.75093329e-01,   1.57509333e+00,   5.75093329e+00,   5.75093329e-02]])
In [595]: np.allclose(fvstack(), fshape())
Out[595]: True
In order for the calls to fvstack() and fshape() to produce the same results, the random number generator is initialized in both functions through np.random.seed(100). Notice that the equality test has been performed using numpy.allclose instead of fvstack() == fshape() to avoid the round-off errors associated with floating-point arithmetic.
As for efficiency, the following interactive session shows that initializing ocmt with its final shape is significantly faster than repeatedly stacking rows:
In [596]: import timeit
In [597]: timeit.timeit('fvstack()', setup="from __main__ import fvstack", number=10000)
Out[597]: 1.4884241055042366
In [598]: timeit.timeit('fshape()', setup="from __main__ import fshape", number=10000)
Out[598]: 0.8819408006311278
I was wondering if there is any function in numpy to determine whether a matrix is unitary.
This is the function I wrote, but it is not working. I would be thankful if you could find the error in my function and/or tell me another way to check whether a given matrix is unitary.
def is_unitary(matrix: np.ndarray) -> bool:
    unitary = True
    n = matrix.size
    error = np.linalg.norm(np.eye(n) - matrix.dot(matrix.transpose().conjugate()))
    if not (error < np.finfo(matrix.dtype).eps * 10.0 * n):
        unitary = False
    return unitary
Let's take an obviously unitary array:
>>> a = 0.7
>>> b = (1-a**2)**0.5
>>> m = np.array([[a,b],[-b,a]])
>>> m.dot(m.conj().T)
array([[ 1.,  0.],
       [ 0.,  1.]])
and try your function on it:
>>> is_unitary(m)
Traceback (most recent call last):
  File "<ipython-input-28-8dc9ddb462bc>", line 1, in <module>
    is_unitary(m)
  File "<ipython-input-20-3758c2016b67>", line 5, in is_unitary
    error = np.linalg.norm(np.eye(n) - matrix.dot(matrix.transpose().conjugate()))
ValueError: operands could not be broadcast together with shapes (4,4) (2,2)
which happens because
>>> m.size
4
>>> np.eye(m.size)
array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])
If we replace n = matrix.size with len(m) or m.shape[0] or something, we get
>>> is_unitary(m)
True
I might just use
>>> np.allclose(np.eye(len(m)), m.dot(m.T.conj()))
True
where allclose has rtol and atol parameters.
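Putting the shape fix and allclose together, a corrected version of the original function might look like this (a sketch; tolerances are left at allclose's defaults):
import numpy as np

def is_unitary(matrix: np.ndarray) -> bool:
    # compare against an identity sized by the number of rows, not matrix.size
    n = matrix.shape[0]
    return np.allclose(np.eye(n), matrix.dot(matrix.conj().T))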
If you are using NumPy's matrix class, there is a property for the Hermitian conjugate, so:
def is_unitary(m):
    return np.allclose(np.eye(m.shape[0]), m.H * m)
e.g.
In [79]: P = np.matrix([[0,-1j],[1j,0]])
In [80]: is_unitary(P)
Out[80]: True