I'd like to create a matrix whose elements are all variables, so I tried the following:
import sympy as sp
from sympy import MatrixSymbol, Matrix
A=sp.symbols('rho0:'+str(side*(side)/2))
rho = MatrixSymbol('rho', side, side)
rho[0][0]=A[0]
count=0
for i in range(side):
    for j in range(i,side):
        rho[i][j]=A[count]
        rho[j][i]=rho[i][j]
        count+=1
Nevertheless, it seems the type of matrix I'm using doesn't support symbols. What should I do?
MatrixSymbol is used to represent an abstract matrix as a single entity/symbol in a manner like Symbol is used to represent a scalar. If you are just going to use generic symbols in order, perhaps you could use
>>> MatrixSymbol('rho', 2, 2).as_explicit()
Matrix([[rho[0, 0], rho[0, 1]], [rho[1, 0], rho[1, 1]]])
Notice that a normal Matrix of symbols has been produced.
But since you want the matrix to be symmetric, you would have to loop back over the matrix and assign the elements below the diagonal, as you tried in your code above.
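A minimal sketch of that loop-back approach, assuming a 3x3 matrix named rho. The size and the as_mutable() conversion are choices made for this illustration (as_explicit() returns an immutable matrix, so it has to be made mutable before assigning):

```python
from sympy import MatrixSymbol

n = 3  # illustrative size, not from the question

# as_explicit() gives an immutable Matrix of elements rho[i, j];
# as_mutable() returns a copy that supports item assignment
rho = MatrixSymbol('rho', n, n).as_explicit().as_mutable()

# copy each element above the diagonal into its mirror position below it
for i in range(n):
    for j in range(i):
        rho[i, j] = rho[j, i]

print(rho.is_symmetric())
```

The result only uses the symbols on and above the diagonal, so it has n*(n+1)/2 independent entries.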
Alternatively, you can modify what you did above by using correct Matrix indexing and a Matrix instead of a MatrixSymbol:
import sympy as sp
from sympy import zeros

side = 2
# note (side+1), not side, and integer division so the range stays an int
A = sp.symbols('rho0:' + str(side*(side+1)//2))
rho = zeros(side, side)
count = 0
for i in range(side):
    for j in range(i, side):
        rho[i,j] = A[count]
        rho[j,i] = rho[i,j]
        count += 1
If you first make a matrix that is not symmetric then you can make a symmetric matrix out of its elements:
In [23]: M = MatrixSymbol('M', 3, 3)
In [24]: M
Out[24]: M
In [25]: M.as_explicit()
Out[25]:
⎡M₀₀ M₀₁ M₀₂⎤
⎢ ⎥
⎢M₁₀ M₁₁ M₁₂⎥
⎢ ⎥
⎣M₂₀ M₂₁ M₂₂⎦
In [26]: M[0,0]
Out[26]: M₀₀
In [27]: Msym = Matrix(3, 3, lambda i, j: M[min(i,j),max(i,j)])
In [28]: Msym
Out[28]:
⎡M₀₀ M₀₁ M₀₂⎤
⎢ ⎥
⎢M₀₁ M₁₁ M₁₂⎥
⎢ ⎥
⎣M₀₂ M₁₂ M₂₂⎦
In [29]: Msym.is_symmetric()
Out[29]: True
I need to solve x from the sparse matrix expression A*x = B in loops, where A is a Scipy CSC sparse matrix and B is a Numpy 1D array. Both A and B are large, with about 500K rows. Basically, I need to update B in each loop, so the speed of updating B is critical. Right now, my way is to define a csc_matrix in each loop and then convert it to a 1D Numpy array as below, which is really expensive in terms of time:
B = csc_matrix((data,(row, col)),shape=(500000, 1), dtype='complex128').toarray()[:,0];
Please note:
row has lots of the repeated index, such as [0,1,2,0,2,2,3,3....],
col is [0,0, 0,.......0];
Is there fast way to update B in each loop?
Assuming col contains only zeros, data, row and col are Numpy arrays, and you want B stored as a Numpy array, you can use Numba to generate B efficiently. Here is how:
import numba as nb

# Works in-place to avoid any slow allocation in the critical loop.
# Note that the type of row may be different.
@nb.njit(nb.void(nb.complex128[:], nb.complex128[:], nb.int64[:]))
def updateVector(B, data, row):
    B.fill(0.)
    # Repeated indices in row are accumulated, like the sparse constructor does
    for i in range(len(row)):
        B[row[i]] += data[i]
updateVector updates the values of B in-place. This assumes B has already been allocated at the correct size (using, for example, B = np.empty(500000, dtype=np.complex128)).
On my machine this is 14 times faster with the following configuration:
row = np.random.randint(0, 500000, size=100000)
col = np.zeros(100000, dtype=np.int64)
data = np.random.rand(100000) + np.random.rand(100000) * 1j
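As a cross-check without Numba, the same accumulation can be sketched in pure NumPy with np.add.at. The helper name update_vector_numpy and the small test arrays are made up for this illustration, and no speed claim is intended (np.add.at is often slower than a compiled loop):

```python
import numpy as np

def update_vector_numpy(B, data, row):
    """Accumulate data into B at positions row, in place (pure NumPy)."""
    B.fill(0)
    # np.add.at is unbuffered, so repeated indices are summed
    # rather than overwritten (unlike B[row] += data)
    np.add.at(B, row, data)

# small illustrative input with repeated row indices
row = np.array([0, 1, 2, 0, 2, 2, 3, 3])
data = np.arange(1, 9) + 1j * np.ones(8)
B = np.empty(5, dtype=np.complex128)

update_vector_numpy(B, data, row)
print(B)
```

Comparing its output with updateVector on the same input is a cheap way to validate the Numba version.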
Consider x, an n x 3 vector.
Is it possible, using built-in methods of numpy or tensorflow, or any Python library, to get a vector of the order n x 1 such that each row is a vector of the order 3 x 1? That is, if x is [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]T, can a vector of the form [[1, 2, 3]T, [4, 5, 6]T, [7, 8, 9]T, [10, 11, 12]T]T be got without for loops or introducing new axes like, say, np.newaxis?
The motive behind this is to get only the diagonal elements of the dot product of x and its transpose. We could, of course, do something like np.diag(x.dot(x.T)). But, if n is significantly large, say, 202933, one can hear the CPU's fan suffering from wheezing. How to actually avoid doing the dot product of all the elements and do so of only the diagonal ones of the phantom dot product without iteration?
Let's take a look at the formula for each element in the result of multiplying x by its own transpose. I don't feel like trying to coerce the Stack Overflow UI into allowing me to use tensor notation, so we'll look conceptually.
Each element at row i, column j of the result is the dot product of row i in x and column j in x.T. Now column j in x.T is just row j in x, and the diagonal is where i and j are the same. So what you want is a sum across the rows of the squared elements of x:
d = (x * x).sum(axis=1)
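A quick check of this identity on the 4 x 3 example from the question. The einsum variant is an equivalent spelling added here for comparison:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

d = (x * x).sum(axis=1)                  # row-wise sum of squares
d_einsum = np.einsum('ij,ij->i', x, x)   # same contraction, written with einsum
d_full = np.diag(x.dot(x.T))             # builds the full n x n product first

print(d)
print(np.array_equal(d, d_full), np.array_equal(d, d_einsum))
```

Only the first two forms avoid allocating the n x n intermediate.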
To address the first part of your question, the transpose operation in numpy rarely makes a copy of your data, so x.T or np.transpose(x) are constant-time operations for even the largest arrays. The reason is that numpy arrays are stored as a block of data along with some meta-data like dimensions, strides between elements in each dimension, and data size. Transposing an array only requires you to modify a small amount of meta-data in the array object, like sizes along each dimension and strides, not copy the whole data set.
The time consuming part is performing the multiplication. Simply having the objects x and x.T costs almost nothing: they both use the same data buffer.
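A small sketch that makes the shared buffer visible. The array shape is arbitrary and chosen only for the illustration:

```python
import numpy as np

x = np.arange(12.0).reshape(4, 3)
xt = x.T

# Same underlying buffer, only the shape and strides metadata differ
print(np.shares_memory(x, xt))
print(x.shape, x.strides)
print(xt.shape, xt.strides)

# Writing through the transpose is visible in the original
xt[0, 1] = -1.0
print(x[1, 0])
```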
This function is likely one of the most efficient ways to handle this. (Taken from trimesh: https://github.com/mikedh/trimesh/blob/main/trimesh/util.py#L589)
import numpy as np

def diagonal_dot(a, b):
    """
    Dot product by row of a and b.
    There are a lot of ways to do this though
    performance varies very widely. This method
    uses a dot product to sum the row and avoids
    function calls if at all possible.
    Parameters
    ------------
    a : (m, d) float
      First array
    b : (m, d) float
      Second array
    Returns
    -------------
    result : (m,) float
      Dot product of each row
    """
    # make sure `a` is numpy array
    # doing it for `a` will force the multiplication to
    # convert `b` if necessary and avoid function call otherwise
    a = np.asanyarray(a)
    # 3x faster than (a * b).sum(axis=1)
    # avoiding np.ones saves 5-10% sometimes
    return np.dot(a * b, [1.0] * a.shape[1])
Comparing performance of some equivalent versions:
In [1]: import numpy as np; import trimesh
In [2]: a = np.random.random((10000, 3))
In [3]: b = np.random.random((10000, 3))
In [4]: %timeit (a * b).sum(axis=1)
1000 loops, best of 3: 181 us per loop
In [5]: %timeit np.einsum('ij,ij->i', a, b)
10000 loops, best of 3: 62.7 us per loop
In [6]: %timeit np.diag(np.dot(a, b.T))
1 loop, best of 3: 429 ms per loop
In [7]: %timeit np.dot(a * b, np.ones(a.shape[1]))
10000 loops, best of 3: 61.3 us per loop
In [8]: %timeit trimesh.util.diagonal_dot(a, b)
10000 loops, best of 3: 55.2 us per loop
According to http://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html, if x and y are given and input arrays are 1-D, where is equivalent to [xv if c else yv for (c,xv, yv) in zip(x!=0, 1/x, x)]. When doing runtime benchmarks, however, they have significantly different speeds:
x = np.array(range(-500, 500))
%timeit np.where(x != 0, 1/x, x)
10000 loops, best of 3: 23.9 µs per loop
%timeit [xv if c else yv for (c,xv, yv) in zip(x!=0, 1/x, x)]
1000 loops, best of 3: 232 µs per loop
Is there a way I can rewrite the second form so that it has a similar runtime to the first? The reason I ask is because I'd like to use a slightly modified version of the second case to avoid division by zero errors:
[1 / xv if c else xv for (c,xv) in zip(x!=0, x)]
Another question: the first case returns a numpy array while the second case returns a list. Is the most efficient way to have the second case return an array to first make a list and then convert the list to an array?
np.array([xv if c else yv for (c,xv, yv) in zip(x!=0, 1/x, x)])
Thanks!
You just asked about 'delaying' the 'where':
numpy.where : how to delay evaluating parameters?
and someone else just asked about divide by zero:
Replace all elements of a matrix by their inverses
When people say that where is similar to the list comprehension, they attempt to describe the action, not the actual implementation.
np.where called with just one argument is the same as np.nonzero. This quickly (in compiled code) loops through the argument, and collects the indices of all non-zero values.
np.where, when called with 3 arguments, returns a new array, collecting values from the 2nd and 3rd arguments based on the nonzero values of the 1st. But it's important to realize that those arguments must be other arrays. They are not functions that it evaluates element by element.
So the where is more like:
m1 = 1/x
m2 = x
[v1 if c else v2 for (c, v1, v2) in zip(x!=0, m1, m2)]
It's easy to run this iteration in compiled code because it just involves 3 arrays of matching size (matching via broadcasting).
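A small sketch of this equivalence, assuming a short float array chosen for the illustration. It also shows that 1/x is computed for every element, including the zero, before np.where ever runs:

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
cond = x != 0

# Both value arrays are computed in full before np.where sees them,
# so 1/x is evaluated at x == 0 too (hence the errstate guard).
with np.errstate(divide='ignore'):
    m1 = 1 / x
m2 = x

picked = np.where(cond, m1, m2)
by_hand = np.array([v1 if c else v2 for c, v1, v2 in zip(cond, m1, m2)])

print(m1)       # contains inf at the zero position
print(picked)   # the inf was never selected
```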
np.array([...]) is a reasonable way of converting a list (or list comprehension) into an array. It may be a little slower than some alternatives because np.array is a powerful general-purpose function. np.fromiter(iterable, dtype) may be faster in some cases, because it isn't as general (you have to specify dtype, and it only works with 1d).
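A minimal sketch of the np.fromiter route, applied to the guarded reciprocal from the question. The small input range and the choice of 0.0 for the zero entry are assumptions for the illustration; passing count is optional but lets NumPy allocate the output once:

```python
import numpy as np

x = np.arange(-5, 5)

# The generator is consumed element by element; no intermediate list is built
gen = (1 / xv if xv != 0 else 0.0 for xv in x)
result = np.fromiter(gen, dtype=float, count=len(x))

print(result)
```

The per-element Python work is still there, so this mainly saves the list-to-array conversion, not the loop itself.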
There are 2 time-proven strategies for getting more speed in element-by-element calculations:
1. use packages like numba and cython to rewrite the problem as compiled code
2. rework your calculations to use existing numpy methods. The use of masking to avoid divide by zero is a good example of this.
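As one possible instance of the masking strategy, the ufunc itself accepts a where mask together with an out array. This is a sketch using np.divide; filling the skipped positions with 0 is an assumption about what the zero entries should become:

```python
import numpy as np

x = np.arange(-500, 500)

# Positions where the mask is False keep whatever `out` already holds,
# so pre-fill it with the value wanted for x == 0.
result = np.zeros(x.shape, dtype=float)
np.divide(1.0, x, out=result, where=(x != 0))

print(result[498:503])
```

No division is performed at the masked-out positions, so no warning is raised.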
=====================
np.ma.where, the version for masked arrays is written in Python. Its code might be instructive. Note in particular this piece:
# Construct an empty array and fill it
d = np.empty(fc.shape, dtype=ndtype).view(MaskedArray)
np.copyto(d._data, xv.astype(ndtype), where=fc)
np.copyto(d._data, yv.astype(ndtype), where=notfc)
It makes a target, and then selectively copies values from the 2 inputs arrays, based on the condition array.
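The same make-a-target-then-copy pattern can be sketched for plain arrays with np.copyto. The variable names and the small input are made up for this illustration, and this is not the np.ma.where source:

```python
import numpy as np

x = np.arange(-3, 4).astype(float)
cond = x != 0

with np.errstate(divide='ignore'):
    inv = 1 / x   # computed everywhere; the inf at x == 0 is never copied

# Construct an empty target and fill it selectively
out = np.empty(x.shape, dtype=float)
np.copyto(out, inv, where=cond)
np.copyto(out, x, where=~cond)

print(out)
```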
You can avoid division by zero while maintaining performance by using advanced indexing:
x = np.arange(-500, 500)
result = np.empty(x.shape, dtype=float) # set the dtype to whatever is appropriate
nonzero = x != 0
result[nonzero] = 1/x[nonzero]
result[~nonzero] = 0
If you for some reason want to bypass an error with numpy it might be worth looking into the errstate context:
x = np.array(range(-500, 500))
with np.errstate(divide='ignore'):  # suppress the divide-by-zero warning
    x = 1/x
x[~np.isfinite(x)] = 0  # convert inf and NaN to 0 (x != x would only catch NaN)
Consider changing the array in place by using np.put():
In [56]: x = np.linspace(-1, 1, 5)
In [57]: x
Out[57]: array([-1. , -0.5, 0. , 0.5, 1. ])
In [58]: indices = np.argwhere(x != 0)
In [59]: indices
Out[59]:
array([[0],
[1],
[3],
[4]], dtype=int64)
In [60]: np.put(x, indices, 1/x[indices])
In [61]: x
Out[61]: array([-1., -2., 0., 2., 1.])
The approach above does not create a new array, which could be very convenient if x is a large array.
Working with a sympy Matrix or numpy array of sympy symbols, how does one take the element-wise logarithm?
For example, if I have:
m=sympy.Matrix(sympy.symbols('a b c d'))
Then np.abs(m) works fine, but np.log(m) does not work ("AttributeError: log").
Any solutions?
Use Matrix.applyfunc:
In [6]: M = sympy.Matrix(sympy.symbols('a b c d'))
In [7]: M.applyfunc(sympy.log)
Out[7]:
⎡log(a)⎤
⎢ ⎥
⎢log(b)⎥
⎢ ⎥
⎢log(c)⎥
⎢ ⎥
⎣log(d)⎦
You can't use np.log because that does a numeric log, but you want the symbolic version, i.e., sympy.log.
If you want an elementwise logarithm, and your matrices are all going to be single-column, you should just be able to use a list comprehension:
>>> m = sympy.Matrix(sympy.symbols('a b c d'))
>>> logm = sympy.Matrix([sympy.log(x) for x in m])
>>> logm
Matrix([
[log(a)],
[log(b)],
[log(c)],
[log(d)]])
This is kind of ugly, but you could wrap it in a function for ease, e.g.:
>>> def sp_log(m):
...     return sympy.Matrix([sympy.log(x) for x in m])
...
>>> sp_log(m)
Matrix([
[log(a)],
[log(b)],
[log(c)],
[log(d)]])
def evalPolynomial(coeffs,x):
    return sum([n for n in coeffs] * [x**(m-1)for m in range(len(coeffs),0,-1)])
TypeError: can't multiply sequence by non-int of type 'list'
Not sure what's causing the error? When I print each of the statements separately, they each give me a list, but when I try to multiply them it doesn't work.
Python lists can only be multiplied by an integer, in which case the elements of the list are repeated:
>>> [1,2,3] * 3
[1, 2, 3, 1, 2, 3, 1, 2, 3]
If you want vectorial operations use numpy.ndarray instead:
>>> import numpy as np
>>> ar = np.array([1,2,3])
>>> ar * 3
array([3, 6, 9])
In particular, there is a numpy function for convolution (i.e. polynomial multiplication):
>>> a = np.array([1,2,3]) # 1 + 2x + 3x^2
>>> b = np.array([4,5,6]) # 4 + 5x + 6x^2
>>> np.convolve(a, b) # (1 + 2x + 3x^2) * (4 + 5x + 6x^2)
array([ 4, 13, 28, 27, 18]) # 4 + 13x + 28x^2 + 27x^3 + 18x^4
If you want to evaluate a polynomial there is the numpy.polyval function which does this.
Keep in mind that using numpy limits the size of the integers, so you might obtain wrong results if the coefficients are so big that they overflow.
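A minimal sketch of numpy.polyval on a made-up polynomial. Note that polyval expects the coefficients ordered from the highest power down, which matches the ordering used by evalPolynomial in the question:

```python
import numpy as np

coeffs = [1, 2, 3]   # x^2 + 2x + 3, highest power first

value = np.polyval(coeffs, 2)                   # single point
values = np.polyval(coeffs, np.array([0, 1, 2]))  # several points at once

print(value)
print(values)
```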
The expression [n for n in coeffs] is a list of integers. Lists do support multiplication by an integer, but this means "make a list that is n copies of the starting list"; this is not what you want in this mathematical context.
I would recommend that you look at the numpy (or scipy which is largely a superset of numpy) package to help with this. It has a function polyval for evaluating exactly what you want, and also provides a class based representation polynomial. In general, for doing numeric computation in Python, you should look at these packages.
But if you want to roll your own, you'll need to do the math inside of the list comprehension. One way to do it (as the body of evalPolynomial) is:
    return sum(n * x**(i-1) for (n, i) in zip(coeffs, range(len(coeffs), 0, -1)))
You are trying to multiply two lists together. This is not a valid operation in Python.
If you want to multiply each corresponding element in the two lists you can use something like this:
def evalPolynomial(coeffs, x):
    # powers of x from len(coeffs)-1 down to 0, matching coeffs' order
    powers = (x**(m-1) for m in range(len(coeffs), 0, -1))
    return sum(c * p for c, p in zip(coeffs, powers))
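A different technique worth naming here is Horner's method, which avoids computing powers altogether. This is a sketch, not taken from any of the answers above; the function name is made up, and it assumes the same highest-power-first coefficient order as the question:

```python
def eval_polynomial_horner(coeffs, x):
    """Evaluate a polynomial with coefficients ordered highest power first."""
    result = 0
    for c in coeffs:
        # fold in one coefficient per step: ((c0*x + c1)*x + c2)...
        result = result * x + c
    return result

print(eval_polynomial_horner([1, 2, 3], 2))  # x^2 + 2x + 3 at x = 2
```

Because it uses plain Python integers, it is not subject to the fixed-width integer overflow mentioned for numpy.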