How to evaluate SymPy expressions with indexed variables using explicit values? - python

Say I have a summation using sympy
from sympy import *
import numpy as np
m = 10
n = IndexedBase('n')
i = symbols("i",cls=Idx)
sum_ = summation(n[i],[i,1,m])
sum_
>>> n[10] + n[1] + n[2] + n[3] + n[4] + n[5] + n[6] + n[7] + n[8] + n[9]
and a numpy array of values
a = np.random.random((m,))
I want to evaluate sum_ using each corresponding value of a - so for example n[1] would be a[0], n[2] would be a[1] and so on. How do I pass the values of a into n?
I have tried using the doit() method, but I am unsure how that works, and keep getting errors.
Furthermore, let's say I have a complicated function which contains sums and that I want to take derivatives of and then evaluate for specific values of the coefficients and variables as below
theta0 = Symbol('theta0')
theta1 = Symbol('theta1')
theta2 = Symbol('theta2')
sigma = Symbol('sigma')
sigma0 = Symbol('sigma0')
sigma1 = Symbol('sigma1')
sigma2 = Symbol('sigma2')
x = IndexedBase('x')
t = IndexedBase('t')
i = symbols("i", cls=Idx)
nges = -(1/(2*sigma**2))*summation( (x[i] - theta0 - theta1*t[i] -
theta2*t[i]**2)**2, [i, 1, 2])
func = (-1/2)*((theta0/sigma0)**2 + (theta1/sigma1)**2 +
(theta2/sigma2)**2) + nges
diff(func, theta0, 1)
>>> -1.0*theta0/sigma0**2 - (4*theta0 + 2*theta1*t[1] + 2*theta1*t[2] + 2*theta2*t[1]**2 + 2*theta2*t[2]**2 - 2*x[1] - 2*x[2])/(2*sigma**2)
How would I pass in scalar values for the theta's and vectors (numpy arrays) for the x's and t's? (I tried using .limit(), but this got cumbersome as I had to call it multiple times on one expression)

The simplest way is to use .subs, passing in a dictionary of substitutions.
sum_.subs({n[i+1]: a[i] for i in range(m)})
In some cases you will want to also invoke evalf to get any symbolic constants like pi evaluated. In this case it's recommended to include substitutions into evalf like this:
sum_.evalf(subs={n[i+1]: a[i] for i in range(m)})
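As a self-contained check of the substitution above, here is a minimal sketch. It assumes a seeded random generator (only so the run is repeatable) and uses the loop variable k, which is not in the original, to avoid shadowing the Idx symbol i.

```python
import numpy as np
from sympy import Idx, IndexedBase, summation, symbols

m = 10
n = IndexedBase("n")
i = symbols("i", cls=Idx)
sum_ = summation(n[i], (i, 1, m))  # expands to n[1] + ... + n[10]

rng = np.random.default_rng(0)  # seed is an assumption, for repeatability only
a = rng.random(m)

# n[k + 1] pairs with a[k]: the symbolic sum is 1-based, NumPy is 0-based
substitutions = {n[k + 1]: a[k] for k in range(m)}
value = float(sum_.subs(substitutions))

print(value, a.sum())  # the two numbers should agree up to rounding
```

The dictionary is built once and can be reused for any other expression that contains the same n[...] entries.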
Similarly for your second example. It's more convenient to prepare a dict with values first.
values = {theta0: 0.2, theta1: 0.3, theta2: 1.3, sigma0: 2, sigma: 2.2}
values.update({t[i]: 3*i for i in range(1, 3)})
values.update({x[i]: 5*i for i in range(1, 3)})
diff(func, theta0, 1).subs(values) # -9.67809917355372
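The second example can be run end to end as follows. This is a sketch under two assumptions: the two-term sum is written with Python's built-in sum over k = 1, 2 instead of summation, and the numeric values are the illustrative ones from the dictionary above.

```python
from sympy import IndexedBase, diff, symbols

theta0, theta1, theta2 = symbols("theta0 theta1 theta2")
sigma, sigma0, sigma1, sigma2 = symbols("sigma sigma0 sigma1 sigma2")
x = IndexedBase("x")
t = IndexedBase("t")

# Residual sum over the two data points, written out explicitly
residuals = sum((x[k] - theta0 - theta1 * t[k] - theta2 * t[k] ** 2) ** 2
                for k in (1, 2))
nges = -residuals / (2 * sigma ** 2)
func = -((theta0 / sigma0) ** 2 + (theta1 / sigma1) ** 2
         + (theta2 / sigma2) ** 2) / 2 + nges

# Scalars first, then one entry per indexed element
values = {theta0: 0.2, theta1: 0.3, theta2: 1.3, sigma0: 2, sigma: 2.2}
values.update({t[k]: 3 * k for k in range(1, 3)})
values.update({x[k]: 5 * k for k in range(1, 3)})

result = float(diff(func, theta0).subs(values))
print(result)  # approximately -9.678
```

sigma1 and sigma2 need no values here because they drop out of the derivative with respect to theta0.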

Related

What does r() function mean in the return value of SymPy's dsolve?

I want to evaluate the value of phi(+oo)
where phi(xi) is the solution of ODE
Eq(Derivative(phi(xi), (xi, 2)), (-K + xi**2)*phi(xi))
and K is a known real variable.
By dsolve, I got the solution:
Eq(phi(xi), -K*xi**5*r(3)/20 + C2*(K**2*xi**4/24 - K*xi**2/2 + xi**4/12 + 1) + C1*xi*(xi**4/20 + 1) + O(xi**6))
with an unknown function r() in the first term on the right-hand side.
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
import sympy
from sympy import I, pi, oo
sympy.init_printing()
def apply_ics(sol, ics, x, known_params):
    """
    Apply the initial conditions (ics), given as a dictionary on
    the form ics = {y(0): y0, y(x).diff(x).subs(x, 0): yp0, ...},
    to the solution of the ODE with independent variable x.
    The undetermined integration constants C1, C2, ... are extracted
    from the free symbols of the ODE solution, excluding symbols in
    the known_params list.
    """
    free_params = sol.free_symbols - set(known_params)
    eqs = [(sol.lhs.diff(x, n) - sol.rhs.diff(x, n)).subs(x, 0).subs(ics)
           for n in range(len(ics))]
    sol_params = sympy.solve(eqs, free_params)
    return sol.subs(sol_params)
K = sympy.Symbol('K', positive = True)
xi = sympy.Symbol('xi',real = True)
phi = sympy.Function('phi')
ode = sympy.Eq( phi(xi).diff(xi, 2), (xi**2-K)*phi(xi))
ode_sol = sympy.dsolve(ode)
ics = { phi(0):1, phi(xi).diff(xi).subs(xi,0): 0}
phi_xi_sol = apply_ics(ode_sol, ics, xi, [K])
Where ode_sol is the solution, phi_xi_sol is the solution after initial conditions are applied.
Since r() is undefined in NumPy I can't evaluate the results by
for g in [0.9, 0.95, 1, 1.05, 1.2]:
    phi_xi = sympy.lambdify(xi, phi_xi_sol.rhs.subs({K:g}), 'numpy')
Does anyone know what this function r() means and how I should deal with it?
As visible in the form of the result, the solver falls back to a power series solution (instead of searching the solution in terms of parabolic cylinder functions as WolframAlpha does).
So let's set phi(xi)=sum a[k]*xi^k leading to the coefficient equations (using a[k]=0 for k<0)
(k+2)(k+1)a[k+2] = -K*a[k] + a[k-2]
a[0] = C2
a[1] = C1
a[2] = -K/2*C2
a[3] = -K/6*C1
a[4] = (K^2/2 + 1)/12*C2
a[5] = (K^2/6 + 1)/20*C1
Inserting these coefficients, the power series solution should have been
C2*(1-K/2*xi**2+(K**2/24+1/12)*xi**4) + C1*xi*(1-K/6*xi**2+(K**2/120+1/20)*xi**4) + O(xi**6)
Comparing with the sympy solution, all terms containing both C1 and K are missing, especially the missing degree 3 term is not explainable. It seems that the solution process was prematurely ended, or some equation transformation was not correctly reversed.
Please note that the ODE solver routines in sympy are experimental and rudimentary. Also, the power series solution gives only valid information for small values of xi, there is no way to derive any exact value for the limit at +oo.
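The coefficient recurrence above can be reproduced mechanically. The following is a minimal sketch, assuming the helper name series_coefficients (hypothetical, not part of SymPy) and treating C1, C2 and K as plain symbols.

```python
import sympy as sp

K, C1, C2 = sp.symbols("K C1 C2")


def series_coefficients(order):
    """Return [a[0], ..., a[order]] from (k+2)(k+1) a[k+2] = -K a[k] + a[k-2]."""
    a = {-2: 0, -1: 0, 0: C2, 1: C1}  # a[k] = 0 for k < 0
    for k in range(order - 1):
        a[k + 2] = sp.expand((-K * a[k] + a[k - 2]) / ((k + 2) * (k + 1)))
    return [a[k] for k in range(order + 1)]


coeffs = series_coefficients(5)
for k, c in enumerate(coeffs):
    print(k, c)
```

Raising the order gives more terms, but the truncated series still only describes the solution near xi = 0.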
The sol_params is a list containing a single dictionary. Passing that dictionary instead of the list gives the solution phi_xi_sol without the r(3):
Eq(rho(s), (-K*s**2/2 + s**2*xi**2/2 + 1)*(-6*rho(s) + 6*C2*s - C2*K*s**3 + O(s**5))/
(3*(K*s**2 - 2)) + C2*(-K*s**3/6 + s**3*xi**2/6 + s) + O(s**5))

How can I simplify this more?

I am trying to apply numpy to this code I wrote for trapezium rule integration:
def integral(a,b,n):
    delta = (b-a)/float(n)
    s = 0.0
    s+= np.sin(a)/(a*2)
    for i in range(1,n):
        s +=np.sin(a + i*delta)/(a + i*delta)
    s += np.sin(b)/(b*2.0)
    return s * delta
I am trying to get the return value from the new function something like this:
return delta *((2 *np.sin(x[1:-1])) +np.sin(x[0])+np.sin(x[-1]) )/2*x
I am trying for a long time now to make any breakthrough but all my attempts failed.
One of the things I attempted and I do not get is why the following code gives too many indices for array error?
def integral(a,b,n):
    d = (b-a)/float(n)
    x = np.arange(a,b,d)
    J = np.where(x[:,1] < np.sin(x[:,0])/x[:,0])[0]
Every hint/advice is very much appreciated.
You forgot to sum over sin(x):
>>> def integral(a, b, n):
...     x, delta = np.linspace(a, b, n+1, retstep=True)
...     y = np.sin(x)
...     y[0] /= 2
...     y[-1] /= 2
...     return delta * y.sum()
...
>>> integral(0, np.pi / 2, 10000)
0.9999999979438324
>>> integral(0, 2 * np.pi, 10000)
0.0
>>> from scipy.integrate import quad
>>> quad(np.sin, 0, np.pi / 2)
(0.9999999999999999, 1.1102230246251564e-14)
>>> quad(np.sin, 0, 2 * np.pi)
(2.221501482512777e-16, 4.3998892617845996e-14)
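The session above integrates sin(x), while the loop in the question integrates sin(x)/x. A possible generalisation is sketched below; the names trapezoid and sin_over_x are assumptions introduced here, and the integrand is passed in as a function. As a side note, np.arange returns a one-dimensional array, which is why indexing it as x[:, 1] raises the "too many indices" error.

```python
import numpy as np


def trapezoid(f, a, b, n):
    """Composite trapezium rule with n equal subintervals on [a, b]."""
    x, delta = np.linspace(a, b, n + 1, retstep=True)
    y = f(x)
    # every sample has weight 1 except the two end points, which have weight 1/2
    return delta * (y.sum() - 0.5 * (y[0] + y[-1]))


def sin_over_x(x):
    # integrand from the question; not defined at x == 0
    return np.sin(x) / x


print(trapezoid(sin_over_x, 1.0, 2.0, 100))
print(trapezoid(np.sin, 0.0, np.pi / 2, 10000))
```

Passing the integrand as an argument keeps the rule itself free of any particular formula, so the same function serves both cases.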
I tried this meanwhile, too.
import numpy as np
def T_n(a, b, n, fun):
    delta = (b - a)/float(n) # delta formula
    x_i = lambda a,i,delta: a + i * delta # calculate x_i
    return 0.5 * delta * \
        (2 * sum(fun(x_i(a, np.arange(0, n + 1), delta))) \
         - fun(x_i(a, 0, delta)) \
         - fun(x_i(a, n, delta)))
Reconstructed the code using formulas at bottom of this page
https://matheguru.com/integralrechnung/trapezregel.html
The summing over the range(0, n+1) - which gives [0, 1, ..., n] -
is implemented using numpy. Usually, you would collect the values using a for loop in normal Python.
But numpy's vectorized behaviour can be used here.
np.arange(0, n+1) gives a np.array([0, 1, ...,n]).
If this array is passed to the function (here abstracted as fun), the function is evaluated at every point x_0 to x_n
and the results are collected in a numpy array. So fun(x_i(...)) returns a numpy array of the function applied to x_0 through x_n. This array is summed up by sum().
The entire sum() is multiplied by 2, and then the function value of x_0 and x_n subtracted afterwards. (Since in the trapezoid formula only the middle summands, but not the first and the last, are multiplied by 2). This was kind of a hack.
The linked German page uses as a function fun(x) = x ^ 2 + 3
which can be nicely defined on the fly by using a lambda expression:
fun = lambda x: x ** 2 + 3
a = -2
b = 3
n = 6
You could instead use a normal function definition, too: def fun(x): return x ** 2 + 3.
So I tested by typing the command:
T_n(a, b, n, fun)
Which correctly returned:
## Out[172]: 27.24537037037037
For your case, just assign np.sin to fun and pass your values for a, b, and n into this function call.
Like:
fun = np.sin # everywhere `fun` appears in the function,
# it behaves as if `np.sin` stood there - this is possible
# because Python treats its functions as first-class citizens
a = ... # your value
b = ... # your value
n = ... # your value
Finally, you can call:
T_n(a, b, n, fun)
And it will work!

Finding a way to replace a column in an np.ogrid with a different formula for a specific value of the iterable

a,b=np.ogrid[0:n:1,0:n:1]
A=np.exp(1j*(np.pi/3)*np.abs(a-b))
a,b=np.diag_indices_from(A)
A[a,b]=1-1j/np.sqrt(3)
is my basis. it produces a grid which acts as an n*n matrix.
My issue is I need to replace a column in the grid, say for example where b=17.
I need for this column to be:
A=np.exp(1j*(np.pi/3)*np.abs(a-17+geo_mean(x)))
except for where a=b where it needs to stay as:
A[a,b]=1-1j/np.sqrt(3)
geo_mean(x) is just a geometric average of 50 values determined from a pseudo random number generator, defined in my code as:
x=[random.uniform(0,0.5) for p in range(0,50)]
def geo_mean(iterable):
    a = np.array(iterable)
    return a.prod()**(1.0/len(a))
So how do i go about replacing a column to include the geo_mean in the exponent formula and do it without changing the diagonal value?
Let's start by saying that diag_indices_from() is kind of useless here since we already know that diagonal elements are those that have equal indices i and j and run up to value n. Therefore, let's simplify the code a little bit at the beginning:
a, b = np.ogrid[0:n:1, 0:n:1]
A = np.exp(1j * (np.pi / 3) * np.abs(a - b))
diag = np.arange(n)
A[diag, diag] = 1 - 1j / np.sqrt(3)
Now, let's say you would like to set the column k values, except for the diagonal element, to
np.exp(1j * (np.pi/3) * np.abs(a - 17 + geo_mean(x)))
(I guess a in the above formula is row index).
This can be done using integer indices, especially since they are almost computed already: we already have diag, and we just need to remove from it the index of the diagonal element that must stay unchanged:
r = np.delete(diag, k)
Then
x = np.random.uniform(0, 0.5, (r.size, 50))
A[r, k] = np.exp(1j * (np.pi/3) * np.abs(r - k + geo_mean(x)))
However, for the above to work, you need to rewrite your geo_mean() function in such a way that it works with 2D input arrays (I will also add some checks and conversions to make it backward compatible):
def geo_mean(x):
    x = np.asarray(x)
    dim = len(x.shape)
    x = np.atleast_2d(x)
    v = np.prod(x, axis=1) ** (1.0 / x.shape[1])
    return v[0] if dim == 1 else v
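Put together, the column replacement might look like the sketch below. It assumes n = 20 and k = 17 purely for illustration, a seeded generator, and a single geometric mean shared by the whole column (as in the question) rather than one mean per row.

```python
import numpy as np


def geo_mean(iterable):
    # geometric mean of a 1-D collection, as defined in the question
    a = np.array(iterable)
    return a.prod() ** (1.0 / len(a))


n, k = 20, 17  # illustrative size and column index (assumptions)
rows, cols = np.ogrid[0:n, 0:n]
A = np.exp(1j * (np.pi / 3) * np.abs(rows - cols))
diag = np.arange(n)
A[diag, diag] = 1 - 1j / np.sqrt(3)

rng = np.random.default_rng(0)  # seed chosen only for repeatability
x = rng.uniform(0, 0.5, 50)
g = geo_mean(x)

r = np.delete(diag, k)  # every row index except the diagonal one
A[r, k] = np.exp(1j * (np.pi / 3) * np.abs(r - k + g))
```

Because r excludes k, the entry A[k, k] keeps its diagonal value and only the off-diagonal entries of column k change.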

Vectorize the midpoint rule for integration

I need some help with this problem.
The midpoint rule for approximating an integral can be expressed as:
h * summation of f(a -(0.5 * h) + i*h)
where h = (b - a)/n
Write a function midpointint(f,a,b,n) to compute the midpoint rule using the numpy sum function.
Make sure your range is from 1 to n inclusive. You could use a range and convert it to an array.
For midpointint(np.sin,0,np.pi,10) the function should return 2.0082
Here is what I have so far
import numpy as np
def midpointint(f,a,b,n):
    h = (b - a) / (float(n))
    for i in np.array(range(1,n+1)):
        value = h * np.sum((f(a - (0.5*h) + (i*h))))
    return value
print(midpointint(np.sin,0,np.pi,10))
My code is not printing out the correct output.
The issue with the posted code is that it needs to accumulate into the output: value += ... after initializing value to zero at the start.
You can vectorize by using a range array for the iterator, like so -
I = np.arange(1,n+1)
out = (h*np.sin(a - (0.5*h) + (I*h))).sum()
Sample run -
In [78]: I = np.arange(1,n+1)
In [79]: (h*np.sin(a - (0.5*h) + (I*h))).sum()
Out[79]: 2.0082484079079745
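Wrapped back into the requested signature, the vectorized expression could look like this minimal sketch, which uses the passed-in f instead of a hard-coded np.sin and assumes f accepts NumPy arrays.

```python
import numpy as np


def midpointint(f, a, b, n):
    """Midpoint rule with n subintervals; f must accept NumPy arrays."""
    h = (b - a) / float(n)
    i = np.arange(1, n + 1)  # 1 to n inclusive
    return h * np.sum(f(a - 0.5 * h + i * h))


print(midpointint(np.sin, 0, np.pi, 10))  # about 2.0082
```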

What is the most pythonic way to conditionally compute?

I'm implementing Bayesian Changepoint Detection in Python/NumPy (if you are interested have a look at the paper). I need to compute likelihoods for data in ranges [a, b], where a and b can have all values from 1 to n. However I can prune the computation at some points, so that I don't have to compute every likelihood. On the other hand some likelihoods are used more than once, so that I can save time by saving the values in a matrix P[a, b]. Right now I check whether the value is already computed, whenever I use it, but I find that a bit of a hassle. It looks like this:
# ...
P = np.ones((n, n)) * np.inf # a likelihood can't get inf, so I use it
                             # as pseudo value
for a in range(n):
    for b in range(a, n):
        # The following two lines get annoying and error prone if you
        # use P more than once
        if P[a, b] == np.inf:
            P[a, b] = likelihood(data, a, b)
        Q[a] += P[a, b] * g[a] * Q[a - 1] # some computation using P[a, b]
# ...
I wonder, whether there is a more intuitive and pythonic way to achieve this, without having the if ... statement before every use of a P[a, b]. Something like an automagical function call if some condition is not met. I could of course make the likelihood function aware of the fact that it could save values, but then it needs some kind of state (e.g. becomes an object). I want to avoid that.
The likelihood function
Since it was asked for in a comment, I add the likelihood function. It actually computes the conjugate prior and then the likelihood. And all in log representation... So it is quite complicated.
from scipy.special import gammaln
def gaussian_obs_log_likelihood(data, t, s):
    n = s - t
    mean = data[t:s].sum() / n
    muT = (n * mean) / (1 + n)
    nuT = 1 + n
    alphaT = 1 + n / 2
    betaT = 1 + 0.5 * ((data[t:s] - mean) ** 2).sum() + ((n)/(1 + n)) * (mean**2 / 2)
    scale = (betaT*(nuT + 1))/(alphaT * nuT)
    # splitting the PDF of the student distribution up is /much/ faster. (~ factor 20)
    prob = 1
    for yi in data[t:s]:
        prob += np.log(1 + (yi - muT)**2/(nuT * scale))
    lgA = gammaln((nuT + 1) / 2) - np.log(np.sqrt(np.pi * nuT * scale)) - gammaln(nuT/2)
    return n * lgA - (nuT + 1)/2 * prob
Although I work with Python 2.7, both answers for 2.7 and 3.x are appreciated.
I would use a sibling of defaultdict for this (you can't use defaultdict directly since it won't tell you the key that is missing):
class Cache(object):
    def __init__(self):
        self.cache = {}
    def get(self, a, b):
        key = (a,b)
        result = self.cache.get(key, None)
        if result is None:
            result = likelihood(data, a, b)
            self.cache[key] = result
        return result
Another approach would be using a cache decorator on likelihood as described here.
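One way to get the P[a, b] indexing syntax without an explicit check is a dict subclass that defines __missing__. The sketch below is written for Python 3; the class name LikelihoodTable and the toy likelihood function are hypothetical stand-ins, not the functions from the question.

```python
class LikelihoodTable(dict):
    """Dict that computes a missing entry on first access and stores it."""

    def __init__(self, compute):
        super().__init__()
        self.compute = compute  # callable taking (a, b)

    def __missing__(self, key):
        # called by dict only when key is absent
        value = self.compute(*key)
        self[key] = value
        return value


calls = []


def likelihood(data, a, b):
    # hypothetical stand-in for the real likelihood; records each call
    calls.append((a, b))
    return float(sum(data[a:b + 1]))


data = [1.0, 2.0, 3.0, 4.0]
P = LikelihoodTable(lambda a, b: likelihood(data, a, b))

first = P[0, 2]   # computed on first access
second = P[0, 2]  # served from the table, no second call
print(first, second, calls)
```

The state lives in the table object, so the likelihood function itself stays stateless, and entries that are pruned are simply never computed.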
