Hello, I want to compute a summation over a numpy array like this:
import numpy as np
import sympy as sy
import cv2
i, j = sy.symbols('i j', Integer=True)
#next read some grayscale image to create a numpy array of pixels
a = cv2.imread(filename)
b = sy.summation(sy.summation(a[i][j], (i,0,1)), (j,0,1)) #double summation
but I'm getting an error. Is it possible to use sympy symbols as numpy array indices? If not, can you suggest a solution?
Thanks.
You can't use numpy objects directly in SymPy expressions, because numpy objects don't know how to deal with symbolic variables.
Instead, create the thing you want symbolically using SymPy objects, and then lambdify it. The SymPy version of a numpy array is IndexedBase, but it seems there is a bug with it, so, since your array is 2-dimensional, you can also use MatrixSymbol.
In [49]: a = MatrixSymbol('a', 2, 2) # Replace 2, 2 with the size of the array
In [53]: i, j = symbols('i j', integer=True)
In [50]: f = lambdify(a, Sum(a[i, j], (i, 0, 1), (j, 0, 1)))
In [51]: b = numpy.array([[1, 2], [3, 4]])
In [52]: f(b)
Out[52]: 10
(also note that the correct syntax for creating integer symbols is symbols('i j', integer=True), not symbols('i j', Integer=True)).
Note that you have to use a[i, j] instead of a[i][j], which isn't supported.
MatrixSymbol is limited to 2-dimensional matrices. To generalize to arrays of
any dimension, you can generate the expression with IndexedBase. lambdify is
currently incompatible with IndexedBase, but it can be used with
DeferredVectors. So the trick is to pass a DeferredVector to lambdify:
import sympy as sy
import numpy as np
a = sy.IndexedBase('a')
i, j, k = sy.symbols('i j k', integer=True)
s = sy.Sum(a[i, j, k], (i, 0, 1), (j, 0, 1), (k, 0, 1))
f = sy.lambdify(sy.DeferredVector('a'), s)
b = np.arange(24).reshape(2,3,4)
result = f(b)
expected = b[:2,:2,:2].sum()
assert expected == result
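If only the numeric value is needed and not a symbolic expression, the same double sum can be taken directly with slicing, as the `expected` line above already hints. A minimal sketch, assuming a small made-up array `img` in place of the image read with cv2:

```python
import numpy as np

# Hypothetical stand-in for the grayscale image read with cv2.imread.
img = np.arange(12).reshape(3, 4)

# Sum of img[i, j] for i in 0..1 and j in 0..1. The upper bounds are
# inclusive, matching the limits (i, 0, 1) and (j, 0, 1) used above.
total = img[:2, :2].sum()

# Same value computed with explicit loops, for comparison.
check = sum(img[i, j] for i in range(2) for j in range(2))
print(total, check)
```

The slice bound is one past the inclusive upper limit of the symbolic sum, which is an easy off-by-one to miss.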
I have a multidimensional numpy array of dtype object, which was filled with other arrays.
As an example, here is a code reproducing that behavior:
arr = np.empty((3,4,2,1), dtype=object)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        for k in range(arr.shape[2]):
            for l in range(arr.shape[3]):
                arr[i, j, k, l] = np.random.random(10)
Since all the inside arrays have the same size, I would like in this example to "incorporate" the last level into the array and make it an array of size (3,4,2,1,10).
I cannot really change the above code, so what I am looking for is a clean way (few lines, possibly without for loops) to generate this new array once created.
Thank you.
If I understood your problem correctly, you could use np.random.random_sample(), which should give the same result:
arr = np.random.random_sample((3, 4, 2, 1, 10))
After the edit to the question, the solution is arr = np.array(arr.tolist())
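A short check of the tolist() approach on the array from the question, as a sketch. The np.stack variant is an alternative added here for comparison, not part of the original answer, and np.ndindex only replaces the nested loops for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Rebuild the object array from the question: each cell holds a length-10 array.
arr = np.empty((3, 4, 2, 1), dtype=object)
for idx in np.ndindex(arr.shape):
    arr[idx] = rng.random(10)

# tolist() unwraps the object array into nested lists of arrays;
# np.array() then builds one regular float array from them.
dense = np.array(arr.tolist())

# Alternative: stack the 24 inner arrays, then restore the leading dimensions.
stacked = np.stack(arr.ravel()).reshape(arr.shape + (10,))

print(dense.shape, dense.dtype)
```

Both variants copy the data once, which is unavoidable because the inner arrays are not contiguous in memory.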
Just by adding a new for loop :
arr = np.empty((3,4,2,1,10), dtype=object)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        for k in range(arr.shape[2]):
            for l in range(arr.shape[3]):
                for m in range(arr.shape[4]):
                    arr[i, j, k, l, m] = np.random.randint(10)
However, you can do this in one line with an optimized numpy function. Every random function in numpy has a size parameter that builds an array of random numbers with a given shape:
arr = np.random.random((3,4,2,1,10))
EDIT :
You can flatten the array, replace every single number by a 1D array of length 10 and then reshape it to your desired shape :
import numpy as np
arr = np.empty((3,4,2,1), dtype=object)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        for k in range(arr.shape[2]):
            for l in range(arr.shape[3]):
                arr[i, j, k, l] = np.random.randint(10)
flat_arr = arr.flatten()
for i in range(len(flat_arr)):
    flat_arr[i] = np.random.randint(0, high=10, size=(10))
# stack the 24 length-10 arrays, then restore the leading dimensions
res_arr = np.stack(flat_arr).reshape((3,4,2,1,10))
import numpy as np
import scipy.sparse
x = np.random.randint(0, 1000, (1000, 100))
# prob better way to do this
d = np.random.random((1000,1000))
d[d < 0.99] = 0
y = scipy.sparse.csr_matrix(d)
What I would like to do is create a new matrix z containing the values of y at the indices in x,
i.e. z[0, 0] should contain y[0, x[0, 0]],
z[0, 1] should contain y[0, x[0, 1]], and so on.
%time for i in range(1000): y[i, x[i]].todense()
~247ms
%time for i in range(1000): np.take(y[i].todense(), x[i])
~150ms
Both of the above work, but I am looking for a faster method; this is currently the bottleneck in my code.
Please assume that representing the whole scipy.sparse matrix as dense isn't feasible.
edit:
%time z = np.vstack([q.todense()[0, p] for q, p in zip(y, x)])
is ~110ms
The answer seems to be to use an appropriately shaped broadcasting index, as outlined here: How to generate multi-dimensional 2D numpy index using a sub-index for one dimension
(that answer deserves more upvotes!)
%time res = y[np.arange(0, 1000).reshape((-1, 1)), x].todense()
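To see why the broadcasting index works, here is a minimal sketch on small, made-up sizes (the names `n_rows`, `n_cols`, `n_picks` are only for illustration). A column vector of row numbers broadcasts against `x`, so entry `[i, k]` of the result picks `y[i, x[i, k]]`. The dense copy `d` is kept here only to verify the result:

```python
import numpy as np
import scipy.sparse

rng = np.random.default_rng(0)
n_rows, n_cols, n_picks = 6, 8, 3  # small hypothetical sizes

# Mostly-zero dense matrix and its sparse CSR form.
d = rng.random((n_rows, n_cols))
d[d < 0.5] = 0
y = scipy.sparse.csr_matrix(d)

# For each row, the column indices to pick.
x = rng.integers(0, n_cols, size=(n_rows, n_picks))

# Column vector of row numbers, shape (n_rows, 1); it broadcasts
# against x, which has shape (n_rows, n_picks).
rows = np.arange(n_rows).reshape(-1, 1)
z = y[rows, x].toarray()  # z[i, k] == y[i, x[i, k]]

# The same indexing on the dense array, used only as a check.
expected = d[rows, x]
print(np.array_equal(z, expected))
```

Only the selected entries are converted to a dense array, so the full sparse matrix is never densified.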
I have for example
import numpy as np
a = np.ones((100, 5, 5))
And I want
d = np.vector_diagonal(a)
assert d.shape == (100, 5)
Where d[i, j] corresponds to a[i, j, j]
How can I do this with numpy?
np.diagonal(a, axis1=1, axis2=2)
You just need to select which two axes form "the matrix" (axis1 and axis2); the remaining axes index the stack of matrices.
The diagonal is taken over the two selected axes and appears as the last axis of the result.
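A small check of this on a made-up 2 x 3 x 3 array, as a sketch. The einsum line is an equivalent alternative added here for comparison, not part of the answer above:

```python
import numpy as np

# Two stacked 3x3 matrices with distinct entries, so diagonals are easy to read.
a = np.arange(2 * 3 * 3).reshape(2, 3, 3)

# Diagonal over axes 1 and 2: d[i, j] == a[i, j, j].
d = np.diagonal(a, axis1=1, axis2=2)

# Equivalent formulation with einsum: a repeated index selects the diagonal.
d_einsum = np.einsum('ijj->ij', a)

print(d)
```

Here `d` comes out as `[[0, 4, 8], [9, 13, 17]]`, one diagonal per stacked matrix.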
I have a matrix A of shape (N, N, T) and a vector V of shape (N,). I want to perform the following operation: A[i, j, ...] = A[i, j, ...]*V[i]/V[j]. I'm doing this with the following loop, but surely there is a way to do it with broadcasting.
A = np.random.randint(0, 5, (2, 2, 3))
V = np.array([2, 3])
for i in range(2):
    for j in range(2):
        A[i, j, ...] *= V[i]
        A[i, j, ...] /= V[j]
I've thought about doing it with element-wise multiplication and numpy broadcasting, and I tried approaches like A * V[:, None, None], but I always got an error.
Is there a more efficient way to do it?
Here's one way to do it -
(A*V[:,None,None])/V[:,None]
Alternatively, in two steps -
A *= V[:,None,None]
A /= V[:,None]
Leverage multi-cores with numexpr -
import numexpr as ne
ne.evaluate('A*V3D/V2D',{'V3D':V[:,None,None],'V2D':V[:,None]})
Note that you might be getting an error because you are writing float results into an int array in place. So either convert A to a float array at the start, or write to a new array with the one-step approach.
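The following sketch compares the broadcast forms against the original loop. It assumes A is converted to float first, as the note suggests; the seed and the names `expected` and `scaled` are only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Float copy of an integer array, so in-place division is allowed.
A = rng.integers(0, 5, size=(2, 2, 3)).astype(float)
V = np.array([2.0, 3.0])

# Reference result from the explicit loop in the question.
expected = A.copy()
for i in range(2):
    for j in range(2):
        expected[i, j, ...] *= V[i]
        expected[i, j, ...] /= V[j]

# One-step form: V[:, None, None] has shape (N, 1, 1) and scales by V[i];
# V[:, None] has shape (N, 1), which aligns with axes (j, t), so it divides by V[j].
scaled = A * V[:, None, None] / V[:, None]

# Two-step in-place form, valid because A is a float array.
A *= V[:, None, None]
A /= V[:, None]

print(np.allclose(scaled, expected), np.allclose(A, expected))
```

The key point is that broadcasting aligns shapes from the trailing axes, which is why the divisor needs one fewer `None` than the multiplier.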
Given two ndarrays, A of shape (n, *m) and B of shape (n,), is there a way to sort A in-place using the order that would sort B?
Sorting A with B is easy using np.argsort, but this is not done in-place:
A = A[np.argsort(B)]
Comments:
A and B have different dtypes, and A can have more than two dimensions. Hence they can’t be stacked to use ndarray.sort().
A takes up a lot of space, which is why it needs to be sorted in-place. Any solution requiring twice the space occupied by A would therefore defeat this purpose.
The title of this question “Re-arranging numpy array in place” may sound related, but the question itself is not very clear, and the answers do not match my question.
Here is a solution that works by following cycles in the index array. It can optionally be compiled using pythran giving a significant speedup if rows are small (80x for 10 elements) and a small speedup if rows are large (30% for 1000 elements).
To keep it pythran compatible I had to simplify it a bit, so it only accepts 2D arrays and it only sorts along axis 0.
Code:
import numpy as np
#pythran export take_inplace(float[:, :] or int[:, :], int[:])
def take_inplace(a, idx):
    n, m = a.shape
    been_there = np.zeros(n, bool)
    keep = np.empty(m, a.dtype)
    for i in range(n):
        if been_there[i]:
            continue
        keep[:] = a[i]
        been_there[i] = True
        j = i
        k = idx[i]
        while not been_there[k]:
            a[j] = a[k]
            been_there[k] = True
            j = k
            k = idx[k]
        a[j] = keep
Sample run using the compiled version. As indicated above, compilation is only required for small rows; for larger rows pure Python should be fast enough.
>>> from timeit import timeit
>>> import numpy as np
>>> import take_inplace
>>>
>>> a = np.random.random((1000, 10))
>>> idx = a[:, 4].argsort()
>>>
>>> take_inplace.take_inplace(a, idx)
>>>
# correct
>>> np.all(np.arange(1000) == a[:, 4].argsort())
True
>>>
# speed
>>> timeit(lambda: take_inplace.take_inplace(a, idx), number=1000)
0.011950935004279017
>>>
# for comparison
>>> timeit(lambda: a[idx], number=1000)
0.02985276997787878
If you can set A beforehand as a structured array whose datatype is composed of a subarray of shape (m, ) and a scalar of the same type (e.g., np.int32), then you can sort it in-place with respect to B. For example:
import numpy as np
B = np.array([3, 1, 2])
A = np.array([[10, 11], [20, 21], [30, 31]])
(n, m) = A.shape
dt = np.dtype([('a', np.int32, (m, )), ('b', int)])
A2 = np.array([(a, b) for a, b in zip(A, B)], dtype=dt)
A2.sort(order='b')
print(A2)
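Building A2 from an existing A with zip still makes a full copy, so the approach only saves memory if the structured array is allocated up front. A minimal sketch of that variant, assuming the data can be written straight into the fields (the field names 'a' and 'b' follow the example above; the values are the same toy data):

```python
import numpy as np

n, m = 3, 2
dt = np.dtype([('a', np.int32, (m,)), ('b', np.int64)])

# Allocate the structured array once and fill its fields directly,
# instead of building separate A and B arrays first.
A2 = np.empty(n, dtype=dt)
A2['a'] = [[10, 11], [20, 21], [30, 31]]
A2['b'] = [3, 1, 2]

# In-place sort of whole records by the key field 'b'.
A2.sort(order='b')

# A2['a'] is a view into A2 with shape (n, m), now in sorted order.
A_sorted = A2['a']
print(A_sorted)
```

After the sort, the rows of `A2['a']` follow the ascending order of the keys in `A2['b']`.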