I am trying to use np.bmat in my numba-optimized python program. To do so, I have to manually define a jitted function bmat since the native one from numpy is not supported:
@njit
def _bmat_2d(matrices):
    arr_rows = []
    for row in matrices:
        arr_rows.append(np.concatenate(row, axis=-1))
    return np.array(np.concatenate(arr_rows, axis=0))
(this code is more or less a simplified copy of the one from numpy)
However:
numba only accepts tuples as input to np.concatenate [1]
numba is very bad at casting arbitrary lists to tuples [2]
Do you have any ideas for this?
Refs:
[1] https://github.com/numba/numba/issues/2787
[2] https://github.com/numba/numba/issues/2771
Would the following work for your purposes?
import numpy as np
import numba as nb
@nb.njit
def _bmat_2d(m):
    out = np.hstack(m[0])
    for row in m[1:]:
        x = np.hstack(row)
        out = np.vstack((out, x))
    return out
A = np.random.randint(10, size=(3,2))
B = np.random.randint(10, size=(3,1))
C = np.random.randint(10, size=(3,3))
D = np.random.randint(10, size=(4,6))
a = np.bmat(((A, B, C), (D,)))
b = _bmat_2d(((A, B, C), (D,)))
print(np.allclose(a, b))  # True
Note that you have to pass in a tuple-of-tuples, rather than a list-of-lists or else you will get a "reflected list" error since Numba in the current version cannot handle list-of-lists.
Since np.hstack was incompatible with numba for me, I had to write my own solution. Maybe some of you find this useful. It's not pretty but it does the job.
This essentially does the same thing as J = np.bmat([[J_1, J_2], [J_3, J_4]]).
Just be sure to change J = np.zeros((8, len(J_1[0])*2)) to fit the output array you want:
import numpy as np
import numba
@numba.njit
def main():
    J_1 = np.array([[-64., 25.6, 25.6, 12.8], [25.6, -25.6, 0., 0.], [25.6, 0., -25.6, 0.], [12.8, 0., 0., -652.8]])
    J_2 = np.array([[-85.33333333, 34.13333333, 34.13333333, 17.06666667], [34.13333333, -34.13333333, 0., 0.], [34.13333333, 0., -34.13333333, 0.], [17.06666667, 0., 0., -870.4]])
    J_3 = np.array([[85.33333333, -34.13333333, -34.13333333, -17.06666667], [-34.13333333, 34.13333333, -0., -0.], [-34.13333333, -0., 34.13333333, -0.], [-17.06666667, -0., -0., 870.4]])
    J_4 = np.array([[-64., 25.6, 25.6, 12.8], [25.6, -25.6, 0., 0.], [25.6, 0., -25.6, 0.], [12.8, 0., 0., -652.8]])
    J = np.zeros((8, len(J_1[0])*2))
    for idx, _ in enumerate(J_1[0]):
        J[0][idx], J[1][idx], J[2][idx], J[3][idx], J[4][idx], J[5][idx], J[6][idx], J[7][idx] = J_1[0][idx], J_1[1][idx], J_1[2][idx], J_1[3][idx], J_3[0][idx], J_3[1][idx], J_3[2][idx], J_3[3][idx]
        J[0][idx+len(J_1[0])], J[1][idx+len(J_1[0])], J[2][idx+len(J_1[0])], J[3][idx+len(J_1[0])], J[4][idx+len(J_1[0])], J[5][idx+len(J_1[0])], J[6][idx+len(J_1[0])], J[7][idx+len(J_1[0])] = J_2[0][idx], J_2[1][idx], J_2[2][idx], J_2[3][idx], J_4[0][idx], J_4[1][idx], J_4[2][idx], J_4[3][idx]
    print(J)
if __name__ == '__main__':
    main()
Edit:
A guy helped me on another thread with this simple replacement for np.bmat which works inside numba.njit:
J = np.vstack((np.hstack((J_1, J_2)), np.hstack((J_3, J_4))))
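If you would rather preallocate the output, as the loop version above does, the element-by-element assignments can be replaced by four slice assignments. This is only a minimal sketch, not how np.bmat is implemented: the helper name block_2x2 is made up for illustration, and it assumes four 2-D arrays whose shapes line up. It uses nothing but np.zeros and basic slicing, so I would expect it to be a reasonable candidate for numba.njit, but I have not verified that here.

```python
import numpy as np

def block_2x2(a, b, c, d):
    # Hypothetical helper: assemble [[a, b], [c, d]] into one array.
    # Assumes a/b share a row count, c/d share a row count,
    # a/c share a column count and b/d share a column count.
    rows_top, cols_left = a.shape
    rows_bottom, cols_right = d.shape
    out = np.zeros((rows_top + rows_bottom, cols_left + cols_right))
    out[:rows_top, :cols_left] = a   # upper-left block
    out[:rows_top, cols_left:] = b   # upper-right block
    out[rows_top:, :cols_left] = c   # lower-left block
    out[rows_top:, cols_left:] = d   # lower-right block
    return out

# Small made-up blocks, just to exercise the helper.
J_1 = np.arange(16.0).reshape(4, 4)
J_2 = J_1 + 100.0
J_3 = -J_2
J_4 = J_1.copy()
J = block_2x2(J_1, J_2, J_3, J_4)
```

The output shape is derived from the inputs, so there is no hard-coded 8 to adjust when the block sizes change.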
I got this working code snippet:
import numpy as np
from matplotlib import pyplot as plt
in_raster = np.random.randn(36, 3, 2151)
matrix = np.reshape(in_raster, [(np.shape(in_raster)[0] * np.shape(in_raster)[1]), np.shape(in_raster)[2]])
# reshaping the matrix to prepare loop
out_raster = np.empty([np.shape(in_raster)[0]//3, np.shape(in_raster)[1]//3, np.shape(in_raster)[2]])
# creating empty output matrix
i = 0
j = 0
while i <= len(in_raster)-9 or j < len(out_raster):
    if i % 9 == 0:
        avg_in_raster = np.nanmean(matrix[i:i+9, :], axis=0)
        out_raster[j] = avg_in_raster
    i += 9
    j += 1
out_raster = np.reshape(out_raster, [np.shape(out_raster)[0], np.shape(in_raster)[1]//3, np.shape(in_raster)[2]])
# plot example
low = 0
high = 50
for row in range(0, 3):
    for col in range(np.shape(in_raster)[1]):
        plt.plot(range(low,high), (in_raster[row, col, low:high]))
plt.plot(range(low,high), (out_raster[0,0,low:high]), 'k')
plt.show()
The program averages (aggregates) 3x3 slices of the input matrix (a raster image) and builds a new one, maintaining the dimensionality of the original matrix.
I have the feeling that there must be an easier way to achieve this.
Does somebody have an idea how to obtain the same result in a more pythonic way?
Thank you!
To my knowledge, there is no easier or quicker way to perform blockwise averaging. Your code might look big, but most of it is just preparation of arrays and resizing or plotting stuff. Your main function is a well-placed while-loop and the averaging itself you leave to numpy which is already a shortcut and should run quickly.
I don't see any reason to further shorten this, without losing readability.
If you just want to make it look shorter and "more pythonic" but less readable, go for this:
import numpy as np
from matplotlib import pyplot as plt
in_raster = np.random.randn(36, 3, 2151)
size=3
matrix=np.array([in_raster[:,:,i].flatten() for i in np.arange(in_raster.shape[2])]).transpose()
out_raster2 = np.array([np.nanmean(matrix[i:i+size**2, :], axis=0) for i in np.arange(len(matrix)) if not i%size**2]).reshape(np.shape(in_raster)[0]//size, np.shape(in_raster)[1]//size, np.shape(in_raster)[2])
# plot example
low = 0
high = 50
for row in range(0, 3):
    for col in range(np.shape(in_raster)[1]):
        plt.plot(range(low,high), (in_raster[row, col, low:high]))
plt.plot(range(low,high), (out_raster2[0,0,low:high]), 'k')
plt.show()
#plt.plot((out_raster2-out_raster)[0,0,low:high]) # should be all 0s
#plt.show()
And you could make it a function/method with a parameter size = 3 and sanity checks (e.g. that the first and second dimensions are divisible by size).
You should be able to do it by extending the shape in one direction and averaging it in that dimension. Like so:
out_raster1 = np.nanmean(in_raster.reshape(36*3//9, -1, 2151 ), axis=1).reshape(12, 1, -1)
To check for consistency,
>>> out_raster1-out_raster
array([[[ 0., 0., 0., ..., 0., 0., 0.]],
[[ 0., 0., 0., ..., 0., 0., 0.]],
[[ 0., 0., 0., ..., 0., 0., 0.]],
...,
[[ 0., 0., 0., ..., 0., 0., 0.]],
[[ 0., 0., 0., ..., 0., 0., 0.]],
[[ 0., 0., 0., ..., 0., 0., 0.]]])
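The same reshape idea can be written without hard-coding 36, 3 and 2151. What follows is a sketch under assumptions: the function name block_nanmean and its size parameter are mine, and it requires both leading dimensions to be divisible by size. It splits each of the first two axes into (number of blocks, size) and averages over the two size axes.

```python
import numpy as np

def block_nanmean(raster, size):
    # Hypothetical helper: average size x size blocks over the first two axes.
    # Assumes a 3-D array whose first two dimensions are divisible by `size`.
    n0, n1, bands = raster.shape
    if n0 % size or n1 % size:
        raise ValueError("first two dimensions must be divisible by size")
    # Axis layout after reshape: (block row, row in block, block col, col in block, band)
    blocks = raster.reshape(n0 // size, size, n1 // size, size, bands)
    return np.nanmean(blocks, axis=(1, 3))

in_raster = np.random.randn(36, 3, 2151)
out_raster3 = block_nanmean(in_raster, 3)   # shape (12, 1, 2151)
```

For the (36, 3, 2151) input this should agree with the one-liner above, since the second axis has exactly one block of width 3.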
I bet I am doing something very simple wrong. I want to start with an empty 2D numpy array and append arrays to it (with dimensions 1 row by 4 columns).
open_cost_mat_train = np.matrix([])
for i in range(10):
    open_cost_mat = np.array([i,0,0,0])
    open_cost_mat_train = np.vstack([open_cost_mat_train,open_cost_mat])
my error trace is:
File "/Users/me/anaconda/lib/python2.7/site-packages/numpy/core/shape_base.py", line 230, in vstack
return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
What am I doing wrong? I have tried append, concatenate, defining the empty 2D array as [[]], as [], array([]) and many others.
You need to reshape your original matrix so that the number of columns match the appended arrays:
open_cost_mat_train = np.matrix([]).reshape((0,4))
After which, it gives:
open_cost_mat_train
# matrix([[ 0., 0., 0., 0.],
# [ 1., 0., 0., 0.],
# [ 2., 0., 0., 0.],
# [ 3., 0., 0., 0.],
# [ 4., 0., 0., 0.],
# [ 5., 0., 0., 0.],
# [ 6., 0., 0., 0.],
# [ 7., 0., 0., 0.],
# [ 8., 0., 0., 0.],
# [ 9., 0., 0., 0.]])
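If you do not need to start from an empty matrix, another common pattern is to collect the rows in a plain Python list and convert once at the end. This is a sketch assuming every row has exactly 4 entries; note that it produces a regular ndarray rather than an np.matrix.

```python
import numpy as np

rows = []
for i in range(10):
    # Each appended row has 4 columns, like open_cost_mat in the question.
    rows.append([i, 0, 0, 0])

# One conversion at the end instead of one vstack per iteration.
open_cost_mat_train = np.array(rows, dtype=float)
```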
If open_cost_mat_train is large I would encourage you to replace the for loop by a vectorized algorithm. I will use the following functions to show how efficiency is improved by vectorizing loops:
def fvstack():
    import numpy as np
    np.random.seed(100)
    ocmt = np.matrix([]).reshape((0, 4))
    for i in range(10):
        x = np.random.random()
        ocm = np.array([x, x + 1, 10*x, x/10])
        ocmt = np.vstack([ocmt, ocm])
    return ocmt
def fshape():
    import numpy as np
    from numpy.matlib import empty
    np.random.seed(100)
    ocmt = empty((10, 4))
    for i in range(ocmt.shape[0]):
        ocmt[i, 0] = np.random.random()
    ocmt[:, 1] = ocmt[:, 0] + 1
    ocmt[:, 2] = 10*ocmt[:, 0]
    ocmt[:, 3] = ocmt[:, 0]/10
    return ocmt
I've assumed that the values that populate the first column of ocmt (shorthand for open_cost_mat_train) are obtained from a for loop, and the remaining columns are a function of the first column, as stated in your comments to my original answer. As real costs data are not available, in the forthcoming example the values in the first column are random numbers, and the second, third and fourth columns are the functions x + 1, 10*x and x/10, respectively, where x is the corresponding value in the first column.
In [594]: fvstack()
Out[594]:
matrix([[ 5.43404942e-01, 1.54340494e+00, 5.43404942e+00, 5.43404942e-02],
[ 2.78369385e-01, 1.27836939e+00, 2.78369385e+00, 2.78369385e-02],
[ 4.24517591e-01, 1.42451759e+00, 4.24517591e+00, 4.24517591e-02],
[ 8.44776132e-01, 1.84477613e+00, 8.44776132e+00, 8.44776132e-02],
[ 4.71885619e-03, 1.00471886e+00, 4.71885619e-02, 4.71885619e-04],
[ 1.21569121e-01, 1.12156912e+00, 1.21569121e+00, 1.21569121e-02],
[ 6.70749085e-01, 1.67074908e+00, 6.70749085e+00, 6.70749085e-02],
[ 8.25852755e-01, 1.82585276e+00, 8.25852755e+00, 8.25852755e-02],
[ 1.36706590e-01, 1.13670659e+00, 1.36706590e+00, 1.36706590e-02],
[ 5.75093329e-01, 1.57509333e+00, 5.75093329e+00, 5.75093329e-02]])
In [595]: np.allclose(fvstack(), fshape())
Out[595]: True
In order for the calls to fvstack() and fshape() to produce the same results, the random number generator is seeded in both functions through np.random.seed(100). Notice that the equality test has been performed using numpy.allclose instead of fvstack() == fshape() to avoid the round-off errors associated with floating point arithmetic.
As for efficiency, the following interactive session shows that initializing ocmt with its final shape is significantly faster than repeatedly stacking rows:
In [596]: import timeit
In [597]: timeit.timeit('fvstack()', setup="from __main__ import fvstack", number=10000)
Out[597]: 1.4884241055042366
In [598]: timeit.timeit('fshape()', setup="from __main__ import fshape", number=10000)
Out[598]: 0.8819408006311278
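When the first column can itself be produced in one call, the remaining loop can go as well. The sketch below is an assumption about your data, not a drop-in replacement: it only applies if the first-column values do not depend on each other, and the names fvectorized and floop are invented for this example. It returns a plain ndarray rather than a matrix.

```python
import numpy as np

def fvectorized(n=10, seed=100):
    # Draw the whole first column at once, then derive the other columns.
    np.random.seed(seed)
    x = np.random.random(n)
    return np.column_stack((x, x + 1, 10 * x, x / 10))

def floop(n=10, seed=100):
    # Reference version that fills the first column one value at a time.
    np.random.seed(seed)
    out = np.empty((n, 4))
    for i in range(n):
        out[i, 0] = np.random.random()
    out[:, 1] = out[:, 0] + 1
    out[:, 2] = 10 * out[:, 0]
    out[:, 3] = out[:, 0] / 10
    return out

same = np.allclose(fvectorized(), floop())
```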
I was wondering if there is any function in numpy to determine whether a matrix is Unitary?
This is the function I wrote but it is not working. I would be thankful if you guys can find an error in my function and/or tell me another way to find out if a given matrix is unitary.
def is_unitary(matrix: np.ndarray) -> bool:
    unitary = True
    n = matrix.size
    error = np.linalg.norm(np.eye(n) - matrix.dot( matrix.transpose().conjugate()))
    if not(error < np.finfo(matrix.dtype).eps * 10.0 *n):
        unitary = False
    return unitary
Let's take an obviously unitary array:
>>> a = 0.7
>>> b = (1-a**2)**0.5
>>> m = np.array([[a,b],[-b,a]])
>>> m.dot(m.conj().T)
array([[ 1., 0.],
[ 0., 1.]])
and try your function on it:
>>> is_unitary(m)
Traceback (most recent call last):
File "<ipython-input-28-8dc9ddb462bc>", line 1, in <module>
is_unitary(m)
File "<ipython-input-20-3758c2016b67>", line 5, in is_unitary
error = np.linalg.norm(np.eye(n) - matrix.dot( matrix.transpose().conjugate()))
ValueError: operands could not be broadcast together with shapes (4,4) (2,2)
which happens because
>>> m.size
4
>>> np.eye(m.size)
array([[ 1., 0., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 0., 1.]])
If we replace n = matrix.size with n = len(matrix) or n = matrix.shape[0], we get
>>> is_unitary(m)
True
I might just use
>>> np.allclose(np.eye(len(m)), m.dot(m.T.conj()))
True
where allclose has rtol and atol parameters.
If you are using NumPy's matrix class, there is a property for the Hermitian conjugate, so:
def is_unitary(m):
    return np.allclose(np.eye(m.shape[0]), m.H * m)
e.g.
In [79]: P = np.matrix([[0,-1j],[1j,0]])
In [80]: is_unitary(P)
Out[80]: True
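For plain ndarrays, which have no .H property, the same check can be written with conj().T. This is a sketch rather than a library function: the square-matrix guard and the rtol/atol parameters (simply forwarded to np.allclose with its default values) are my additions.

```python
import numpy as np

def is_unitary(m, rtol=1e-05, atol=1e-08):
    # Works on plain ndarrays; returns False for anything that is not square.
    m = np.asarray(m)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        return False
    identity = np.eye(m.shape[0])
    # m is unitary when its conjugate transpose times m is the identity.
    return bool(np.allclose(identity, m.conj().T @ m, rtol=rtol, atol=atol))

a = 0.7
b = (1 - a**2) ** 0.5
rotation = np.array([[a, b], [-b, a]])
pauli_y = np.array([[0, -1j], [1j, 0]])
shear = np.array([[1.0, 1.0], [0.0, 1.0]])

results = (is_unitary(rotation), is_unitary(pauli_y), is_unitary(shear))
```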
For my astronomy homework, I need to simulate the elliptical orbit of a planet around a sun. To do this, I need to use a for loop to repeatedly calculate the motion of the planet. However, every time I try to run the program, I get the following error:
RuntimeWarning: invalid value encountered in power
r=(x**2+y**2)**1.5
Traceback (most recent call last):
File "planetenstelsel3-4.py", line 25, in <module>
ax[i] = a(x[i],y[i])*x[i]
ValueError: cannot convert float NaN to integer
I've done some testing, and I think the problem lies in the fact that the values that are calculated are greater than what fits in an integer, and the arrays are defined as int arrays. So if there was a way to define them as float arrays, maybe it would work. Here is my code:
import numpy as np
import matplotlib.pyplot as plt
dt = 3600 #s
N = 5000
x = np.tile(0, N)
y = np.tile(0, N)
x[0] = 1.496e11 #m
y[0] = 0.0
vx = np.tile(0, N)
vy = np.tile(0, N)
vx[0] = 0.0
vy[0] = 28000 #m/s
ax = np.tile(0, N)
ay = np.tile(0, N)
m1 = 1.988e30 #kg
G = 6.67e-11 #Nm^2kg^-2
def a(x,y):
    r=(x**2+y**2)**1.5
    return (-G*m1)/r
for i in range (0,N):
    r = x[i],y[i]
    ax[i] = a(x[i],y[i])*x[i]
    ay[i] = a(x[i],y[i])*y[i]
    vx[i+1] = vx[i] + ax[i]*dt
    vy[i+1] = vy[i] + ay[i]*dt
    x[i+1] = x[i] + vx[i]*dt
    y[i+1] = y[i] + vy[i]*dt
plt.plot(x,y)
plt.show()
The first few lines are just some starting parameters.
Thanks for the help in advance!
When you are doing physics simulations you should definitely use floats for everything. 0 is an integer constant in Python, and thus np.tile(0, N) creates integer arrays; use 0.0 as the argument to np.tile to create floating point arrays, or preferably use np.zeros(N) instead.
You can check the datatype of any array variable from its dtype member:
>>> np.tile(0, 10).dtype
dtype('int64')
>>> np.tile(0.0, 10).dtype
dtype('float64')
>>> np.zeros(10)
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
>>> np.zeros(10).dtype
dtype('float64')
To get a zeroed array of float32 you'd need to give a float32 as the argument:
>>> np.tile(np.float32(0), 10)
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)
or, preferably, use zeros with a defined dtype:
>>> np.zeros(10, dtype='float32')
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)
You need x = np.zeros(N), etc.: this declares the arrays as float arrays.
This is the standard way of putting zeros in an array (np.tile() is convenient for creating a tiling with a fixed array).
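Putting that together, here is a sketch of the simulation with float arrays. Two things in it are my assumptions rather than part of the question: the loop stops at N - 1 so that index i + 1 stays inside the arrays, and the plotting calls are left out so the snippet runs without matplotlib.

```python
import numpy as np

dt = 3600.0      # s
N = 5000
m1 = 1.988e30    # kg
G = 6.67e-11     # N m^2 kg^-2

# np.zeros gives float64 arrays, so large values and fractions are stored as-is.
x, y = np.zeros(N), np.zeros(N)
vx, vy = np.zeros(N), np.zeros(N)
ax, ay = np.zeros(N), np.zeros(N)

x[0] = 1.496e11  # m
vy[0] = 28000.0  # m/s

def a(x, y):
    # Acceleration per unit distance: -G*m1 / r^3
    r = (x**2 + y**2) ** 1.5
    return (-G * m1) / r

# Stop one step early so that i + 1 is always a valid index (assumed fix).
for i in range(N - 1):
    ax[i] = a(x[i], y[i]) * x[i]
    ay[i] = a(x[i], y[i]) * y[i]
    vx[i + 1] = vx[i] + ax[i] * dt
    vy[i + 1] = vy[i] + ay[i] * dt
    x[i + 1] = x[i] + vx[i] * dt
    y[i + 1] = y[i] + vy[i] * dt
```

After the loop, plt.plot(x, y) from the original script can be used unchanged.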
I am very new to Python (in the past I used Mathematica, Maple, or Matlab scripts). I am very impressed by how NumPy can evaluate functions over arrays, but I am having problems implementing this in several dimensions. My question is very simple (please don't laugh): is there a more elegant and efficient way to evaluate some function f (which is defined over R^2) without using loops?
import numpy
M=numpy.zeros((10,10))
for i in range(0,10):
    for j in range(0,10):
        M[i,j]=f(i,j)
return M
The goal when coding with numpy is to implement your computation on the whole array, as much as possible. So if your function is, for example, f(x,y) = x**2 + 2*y and you want to apply it to all integer pairs x,y in [0,9]x[0,9] (the stop value of each slice is excluded), do:
import numpy as np
x,y = np.mgrid[0:10, 0:10]
fxy = x**2 + 2*y
If you don't find a way to express your function in such a way, then:
Ask how to do it (and state explicitly the function definition)
use numpy.vectorize
Same example using vectorize:
def f(x,y): return x**2 + 2*y
x,y = np.mgrid[0:10, 0:10]
fxy = np.vectorize(f)(x.ravel(),y.ravel()).reshape(x.shape)
Note that in practice I only use vectorize, similarly to python's map, when the contents of the arrays are not numbers. A typical example is to compute the length of every list in an array of lists:
# construct a sample array of lists (dtype=object, since the lists have different lengths)
list_of_lists = np.array([list(range(i)) for i in range(1000)], dtype=object)
print(np.vectorize(len)(list_of_lists))
# [0,1 ... 998,999]
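One more option, not mentioned above, is np.fromfunction, which calls f once with whole index arrays instead of once per element. The sketch below reuses the example f(x,y) = x**2 + 2*y and compares the result with the explicit double loop from the question; it assumes f is built from array-friendly operations.

```python
import numpy as np

def f(x, y):
    return x**2 + 2*y

# Reference: the explicit double loop from the question.
M_loop = np.zeros((10, 10))
for i in range(10):
    for j in range(10):
        M_loop[i, j] = f(i, j)

# f is called once, with two (10, 10) arrays holding the row and column indices.
M_fromfunction = np.fromfunction(f, (10, 10))

# The mgrid version evaluated through the same f.
x, y = np.mgrid[0:10, 0:10]
M_grid = f(x, y)
```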
Yes, many numpy functions operate on N-dimensional arrays. Take this example:
>>> M = numpy.zeros((3,3))
>>> M[0][0] = 1
>>> M[2][2] = 1
>>> M
array([[ 1., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 1.]])
>>> M > 0.5
array([[ True, False, False],
[False, False, False],
[False, False, True]], dtype=bool)
>>> numpy.sum(M)
2.0
Note the difference between numpy.sum, which operates on N-dimensional arrays, and sum, which only goes 1 level deep:
>>> sum(M)
array([ 1., 0., 1.])
So if you build your function f() out of operations that work on n-dimensional arrays, then f() itself will work on n-dimensional arrays.
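As a small illustration of that last point, here is a made-up f written only with NumPy operations. The same function accepts two scalars or two arrays, and with one column vector and one row vector, broadcasting fills in the whole grid. The choice of f (a Euclidean distance) is just an example.

```python
import numpy as np

def f(x, y):
    # Built only from NumPy operations, so it accepts scalars and arrays alike.
    return np.sqrt(x**2 + y**2)

i = np.arange(10)[:, None]   # column vector, shape (10, 1)
j = np.arange(10)[None, :]   # row vector, shape (1, 10)
M = f(i, j)                  # broadcasts to shape (10, 10)

single = f(3, 4)             # the same function on two scalars
```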
You can also use numpy multi-dimension slicing, like below. You just provide slices for each dimension:
arr = np.zeros((5,5)) # 5 rows, 5 columns
# update only first column
arr[:,0] = 1
# update only last row ... same as arr[-1] = 1
arr[-1,:] = 1
# update center
arr[1:-1, 1:-1] = 1
print(arr)
output:
array([[ 1., 0., 0., 0., 0.],
[ 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1.]])
A pure python answer, not depending upon numpy tools, is to make the Cartesian Product of two sequences:
from itertools import product
for i, j in product(range(0, 10), range(0, 10)):
    M[i,j]=f(i,j)
Edit: Actually, I should have read the question properly. This still uses loops, just one less loop.