Substituting values in SymPy summation - python

When substituting values into a SymPy sum, it doesn't seem to recognise that the variables are indexed, and simply factors out all the indexed variables, like so:
# Define variables.
z_tilde_i = sympy.IndexedBase('\\tilde{z}')
rho_i = sympy.IndexedBase('\\rho')
M = sympy.symbols('M')
n = sympy.symbols('n', integer = True)
i = sympy.Idx('i', n)
# Define equation M = sum(rho * deltaZ).
eq_total_mass = sympy.Eq(M, sympy.Sum(rho_i[i] * (z_tilde_i[i + 1] - z_tilde_i[i]), (i, 0, n - 1)))
# Try to substitute values.
print(eq_total_mass.rhs.subs(n, 3).doit())
>>> 3*(\tilde{z}[i + 1] - \tilde{z}[i])*\rho[i]
How can I make the SymPy sum recognise the indexed variables?

As a workaround, there is no need to define i as an Idx; a plain symbol works:
>>> from sympy import Sum, var
>>> i = var('i')
>>> Sum(rho_i[i] * (z_tilde_i[i + 1] - z_tilde_i[i]), (i, 0, 1)).doit()
(-\tilde{z}[0] + \tilde{z}[1])*\rho[0] + (-\tilde{z}[1] + \tilde{z}[2])*\rho[1]
Or, if you do, don't pass integer=True when defining n:
>>> n = var('n')
>>> i = sympy.Idx('i', n)
>>> Sum(rho_i[i] * (z_tilde_i[i + 1] - z_tilde_i[i]), (i, 0, 1)).doit()
(-\tilde{z}[0] + \tilde{z}[1])*\rho[0] + (-\tilde{z}[1] + \tilde{z}[2])*\rho[1]
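To tie the workaround back to the original equation, here is a minimal sketch that keeps the symbolic upper limit n - 1 and substitutes n afterwards. It assumes the plain-symbol index from the first workaround; the shortened variable names (z_tilde, rho, expanded) are only illustrative.

```python
import sympy

# Same quantities as in the question, but i is a plain Symbol rather than an Idx.
z_tilde = sympy.IndexedBase('\\tilde{z}')
rho = sympy.IndexedBase('\\rho')
M, n, i = sympy.symbols('M n i')

# M = sum(rho * deltaZ) with a symbolic upper limit.
eq_total_mass = sympy.Eq(
    M, sympy.Sum(rho[i] * (z_tilde[i + 1] - z_tilde[i]), (i, 0, n - 1))
)

# Substituting n only changes the upper limit, so doit() can expand term by term.
expanded = eq_total_mass.rhs.subs(n, 3).doit()
print(expanded)
```

The printed result should contain one term for each of i = 0, 1, 2 instead of a single factored-out term.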

Related

SymPy division doesn't cancel what it can when using symbolic denominator

I have some code using sympy.solvers.solve() that basically leads to the following:
>>> k, u, p, q = sympy.symbols('k u p q')
>>> solution = (k*u + p*u + q)/(k+p)
>>> solution.simplify()
(k*u + p*u + q)/(k + p)
Now, my problem is that it is not simplified enough/correctly. It should be giving the following:
q/(k + p) + u
From the original equation q = (k + p)*(m - u) this is more obvious (when you solve it manually, which my students will be doing).
I have tried many combinations of sol.simplify(), sol.cancel(), sol.collect(u) but I haven't found what can make it work (btw, the collect I can't really use, as I won't know beforehand which symbol will have to be collected, unless you can make something that collects all the symbols in the solution).
I am working with BookWidgets, which automatically corrects the answers that students give, which is why it's important that I have an output which will match what the students will enter.
First things first:
- There is no "standard" output of a simplification step.
- If the output of a simplification step doesn't suit your needs, you can manipulate the expression further with simplify, expand, collect, ...
- Two or more sequences of operations (simplify, expand, collect, ...) may lead to different results or to the same result. It depends on the expression being manipulated.
Let me show you with your example:
from sympy import Add, fraction, symbols
k, u, p, q = symbols('k u p q')
solution = (k*u + p*u + q)/(k+p)
# out1: (k*u + p*u + q)/(k + p)
solution = solution.collect(u)
# out2: (q + u*(k + p))/(k + p)
num, den = fraction(solution)
# use the linearity of addition
solution = Add(*[t / den for t in num.args])
# out3: q/(k + p) + u
In the above code, out1, out2, out3 are mathematically equivalent.
Instead of spending time to simplify outputs, I would test for mathematical equivalence with the equals method. For example:
verified_solution = (k*u + p*u + q)/(k+p)
num, den = fraction(verified_solution)
first_student_solution = Add(*[t / den for t in num.args])
print(verified_solution.equals(first_student_solution))
# True
second_student_solution = q/(k + p) + u
print(verified_solution.equals(second_student_solution))
# True
third_student_solution = q/(k + p) + u + 2
print(verified_solution.equals(third_student_solution))
# False
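If the student answers arrive as text, the same idea can be wrapped in a small helper. This is only a sketch: the function name matches_reference is made up for illustration, and it assumes the answers are strings that sympify can parse. It checks whether the difference simplifies to zero, which works for rational expressions like this one but is not a proof of inequality in general.

```python
from sympy import simplify, sympify


def matches_reference(student_answer: str, reference: str) -> bool:
    """Return True when the difference of the two expressions simplifies to zero."""
    difference = sympify(student_answer) - sympify(reference)
    return simplify(difference) == 0


reference = "(k*u + p*u + q)/(k + p)"
print(matches_reference("q/(k + p) + u", reference))
print(matches_reference("q/(k + p) + u + 2", reference))
```

The first call should report a match and the second should not, mirroring the equals results above.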
It looks like you want the expression in quotient/remainder form:
>>> n, d = solution.as_numer_denom()
>>> div(n, d)
(u, q)
>>> _[0] + _[1]/d
q/(k + p) + u
But that SymPy function may give unexpected results when the symbol names are changed, as described here. Here is an alternative (for which I did not find an existing function in SymPy) that attempts a result closer to synthetic division:
from sympy import S, fraction, signsimp
def sdiv(p, q):
    """return w, r if p = w*q + r else 0, p
    Examples
    ========
    >>> from sympy.abc import x, y, z
    >>> sdiv(x, x)
    (1, 0)
    >>> sdiv(x, y)
    (0, x)
    >>> sdiv(2*x + 3, x)
    (2, 3)
    >>> a, b=x + 2*y + z, x + y
    >>> sdiv(a, b)
    (1, y + z)
    >>> sdiv(a, -b)
    (-1, y + z)
    >>> sdiv(-a, -b)
    (1, -y - z)
    >>> sdiv(-a, b)
    (-1, -y - z)
    """
    from sympy.core.function import _mexpand
    P, Q = map(lambda i: _mexpand(i, recursive=True), (p, q))
    r, wq = P.as_independent(*Q.free_symbols, as_Add=True)
    # quick exit if no full division possible
    if Q.is_Add and not wq.is_Add:
        return S.Zero, P
    # check multiplicative cancellation
    w, bot = fraction((wq/Q).cancel())
    if bot != 1 and wq.is_Add and Q.is_Add:
        # try maximal additive extraction
        s1 = s2 = 1
        if signsimp(Q, evaluate=False).is_Mul:
            wq = -wq
            r = -r
            Q = -Q
            s1 = -1
        if signsimp(wq, evaluate=False).is_Mul:
            wq = -wq
            s2 = -1
        xa = wq.extract_additively(Q)
        if xa:
            was = wq.as_coefficients_dict()
            now = xa.as_coefficients_dict()
            dif = {k: was[k] - now.get(k, 0) for k in was}
            n = min(was[k]//dif[k] for k in dif)
            dr = wq - n*Q
            w = s2*n
            r = s1*(r + s2*dr)
            assert _mexpand(p - (w*q + r)) == 0
            bot = 1
    return (w, r) if bot == 1 else (S.Zero, p)
The more general suggestion from Davide_sd about using equals is good if you are only testing the equality of two expressions in different forms.

Implementing Smith-Waterman algorithm for local alignment in python

I have created a sequence alignment tool to compare two strands of DNA (X and Y) to find the best alignment of substrings from X and Y. The algorithm is summarized here (https://en.wikipedia.org/wiki/Smith–Waterman_algorithm). I have been able to generate a list of lists, filled with zeros, to represent my matrix. I created a scoring algorithm to return a numerical score for each kind of alignment between bases (e.g. plus 4 for a match). Then I created an alignment algorithm that should put a score in each coordinate of my "matrix". However, when I go to print the matrix, it only returns the original with all zeros (rather than actual scores).
I know there are other methods of implementing this method (with numpy for example), so could you please tell me why this specific code (below) does not work? Is there a way to modify it, so that it does work?
code:
def zeros(X: int, Y: int):
    lenX = len(X) + 1
    lenY = len(Y) + 1
    matrix = []
    for i in range(lenX):
        matrix.append([0] * lenY)
    def score(X, Y):
        if X[n] == Y[m]: return 4
        if X[n] == '-' or Y[m] == '-': return -4
        else: return -2
    def SmithWaterman(X, Y, score):
        for n in range(1, len(X) + 1):
            for m in range(1, len(Y) + 1):
                align = matrix[n-1, m-1] + (score(X[n-1], Y[m-1]))
                indelX = matrix[n-1, m] + (score(X[n-1], Y[m]))
                indelY = matrix[n, m-1] + (score(X[n], Y[m-1]))
        matrix[n, m] = max(align, indelX, indelY, 0)
    print(matrix)
zeros("ACGT", "ACGT")
output:
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
The reason it's just printing out the zeroed out matrix is that the SmithWaterman function is never called, so the matrix is never updated.
You would need to do something like
# ...
SmithWaterman(X, Y, score)
print(matrix)
# ...
However, if you do this, you will find that this code is actually quite broken in many other ways. I've gone through and annotated some of the syntax errors and other issues with the code:
def zeros(X: int, Y: int):
    # X and Y: incorrect type annotations. should be str
    lenX = len(X) + 1
    lenY = len(Y) + 1
    matrix = []
    for i in range(lenX):
        matrix.append([0] * lenY)
    # A more "pythonic" way of expressing the above would be:
    # matrix = [[0] * (len(Y) + 1) for _ in range(len(X) + 1)]
    def score(X, Y):
        # X and Y: shadowing variables from outer scope. this is not a bug per se but it's considered bad practice
        if X[n] == Y[m]: return 4
        # n and m: variables not defined in scope
        if X[n] == '-' or Y[m] == '-': return -4
        # n and m: variables not defined in scope
        else: return -2
    def SmithWaterman(X, Y, score): # this function is never called
        # score: unnecessary function passed as parameter. function is defined in scope
        for n in range(1, len(X) + 1):
            for m in range(1, len(Y) + 1):
                align = matrix[n-1, m-1] + (score(X[n-1], Y[m-1]))
                # matrix[n-1, m-1]: invalid list lookup. should be: matrix[n-1][m-1]
                indelX = matrix[n-1, m] + (score(X[n-1], Y[m]))
                # Y[m]: out of bounds error when m == len(Y)
                indelY = matrix[n, m-1] + (score(X[n], Y[m-1]))
                # X[n]: out of bounds error when n == len(X)
        matrix[n, m] = max(align, indelX, indelY, 0)
        # this should be nested in the inner for-loop. m, n, indelX, and indelY are not defined in scope here
    print(matrix)
zeros("ACGT", "ACGT")
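For comparison, here is a minimal sketch of how the scoring matrix could be filled once those issues are addressed. It is not the asker's code: the function name smith_waterman_matrix is made up, the scores (+4 match, -2 mismatch, -4 gap) are taken from the question's score function, and a simple linear gap penalty is assumed. It only fills the matrix and does not do the traceback.

```python
def smith_waterman_matrix(x: str, y: str, match=4, mismatch=-2, gap=-4):
    """Fill the local-alignment scoring matrix for sequences x and y."""
    rows, cols = len(x) + 1, len(y) + 1
    # Build each row separately so the rows are independent list objects.
    matrix = [[0] * cols for _ in range(rows)]
    for n in range(1, rows):
        for m in range(1, cols):
            substitution = match if x[n - 1] == y[m - 1] else mismatch
            align = matrix[n - 1][m - 1] + substitution
            gap_in_y = matrix[n - 1][m] + gap  # x[n-1] aligned against a gap
            gap_in_x = matrix[n][m - 1] + gap  # y[m-1] aligned against a gap
            # Local alignment: scores are never allowed to drop below zero.
            matrix[n][m] = max(align, gap_in_y, gap_in_x, 0)
    return matrix


for row in smith_waterman_matrix("ACGT", "ACGT"):
    print(row)
```

The function returns the matrix instead of relying on a variable from an enclosing scope, so the caller decides when to print it.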

How to make nested list behave like numpy array?

I'm trying to implement an algorithm in Python that counts the subsets with a given sum:
import numpy as np
maxN = 20
maxSum = 1000
minSum = 1000
base = 1000
dp = np.zeros((maxN, maxSum + minSum))
v = np.zeros((maxN, maxSum + minSum))
# Function to return the required count
def findCnt(arr, i, required_sum, n) :
    # Base case
    if (i == n) :
        if (required_sum == 0) :
            return 1
        else :
            return 0
    # If the state has been solved before
    # return the value of the state
    if (v[i][required_sum + base]) :
        return dp[i][required_sum + base]
    # Setting the state as solved
    v[i][required_sum + base] = 1
    # Recurrence relation
    dp[i][required_sum + base] = findCnt(arr, i + 1, required_sum, n) + findCnt(arr, i + 1, required_sum - arr[i], n)
    return dp[i][required_sum + base]
arr = [ 2, 2, 2, 4 ]
n = len(arr)
k = 4
print(findCnt(arr, 0, k, n))
It gives the expected result, but I was asked not to use numpy, so I replaced the numpy arrays with nested lists like this:
#dp = np.zeros((maxN, maxSum + minSum)) replaced by
dp = [[0]*(maxSum + minSum)]*maxN
#v = np.zeros((maxN, maxSum + minSum)) replaced by
v = [[0]*(maxSum + minSum)]*maxN
But now the program always outputs 0. I think this is because of some difference in behavior between numpy arrays and nested lists, but I don't know how to fix it.
EDIT:
Thanks to #venky__, who provided this solution in the comments:
[[0 for i in range( maxSum + minSum)] for i in range(maxN)]
It worked, but I still don't understand the difference between it and what I was doing before. I tried:
print( [[0 for i in range( maxSum + minSum)] for i in range(maxN)] == [[0]*(maxSum + minSum)]*maxN )
The result is True, so how was this able to fix the problem?
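It turns out that I was using nested lists the wrong way to represent 2D arrays. With [[0]*k]*n, Python does not create separate sublists: every row refers to the same sublist object, so a write to one row shows up in all of them. The == comparison still returns True because it compares values, not object identity. For a better explanation, please read this.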
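A small demonstration of the difference, using a 3 by 4 grid instead of the sizes from the question (the names shared and separate are only for illustration):

```python
rows, cols = 3, 4

shared = [[0] * cols] * rows                  # one inner list, referenced three times
separate = [[0] * cols for _ in range(rows)]  # three distinct inner lists

print(shared == separate)          # True: == only compares the values
print(shared[0] is shared[1])      # True: both rows are the same object
print(separate[0] is separate[1])  # False: the rows are independent

# Write to a single cell in each structure.
shared[0][0] = 1
separate[0][0] = 1
print(shared)    # [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(separate)  # [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```

In the memoized function this means that marking v[i][...] as solved marks the same column in every row at once, which is consistent with the result collapsing to 0.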

Solving KKT equations in SymPy

I am trying to solve KKT equations using sympy. All of the equations are symbolic and contain constants that are not given as numbers but as symbols. Along with the equations, there are also inequality constraints.
Is it possible to do this in sympy? If not, are there any alternatives?
An example would be:
Doing your question by hand, we get the first Lagrangian to be L = x**2 - b*x + 1 - lambda*(x - a).
Differentiating with respect to x and lambda and setting the derivatives to 0, we solve to get x = a, lambda = 2*a - b.
So if lambda < 0, the constraint is not binding and we repeat with the new Lagrangian, L = x**2 - b*x + 1.
This gives us 2 cases: if 2*a - b < 0, then x = b/2; otherwise, x = a.
The following code reproduces this logic. Copy paste this into a file called lagrangian.py and see example 4 for your specific query.
"""
KKT Conditions:
Goal:
To minimise/maximise f(x_) subject to gi(x_) >= 0 for all i and hi(x_) == 0 for all i
where x_ refers to a vector x.
Variables with a `*` after them are optimal quantities.
1. gi(x*) and hi(x*) are feasible (that is, the constraints are satisfied)
2. (df/dxj)(x*) - sum(lambdai* * (dgi/dxj)(x*)) - sum(mui* * (dhi/dxj)(x*)) = 0 for all j
3. lambdai* * gi(x*) = 0 for all i
4. lambdai >= 0 for all i
"""
from sympy import *
from typing import Iterable
def kkt(f, g=None, h=None, x=None):
    """
    Finds the optimal values of `x` for `f` given the inequalities `g[i] >= 0` for all i
    and the equalities `h[i] == 0` for all i.
    TODO: Remove the private variables _lambda and _mu from the output.
    TODO: Make the output more user friendly.
    Examples:
    >>> from sympy import *
    >>> from lagrangian import kkt
    >>> x = list(symbols("x:3"))
    Example 0 the most basic
    >>> kkt(x[0]**2)
    ([x0], FiniteSet((0,)))
    Example 1 from References
    >>> kkt(2 * x[0] ** 2 + x[1] ** 2 + 4 * x[2]**2,
    ... h=[x[0] + 2*x[1] - x[2] - 6, 2 * x[0] - 2 * x[1] + 3 * x[2] - 12])
    ([_mu0, _mu1, x0, x1, x2], FiniteSet((504/67, 424/67, 338/67, 80/67, 96/67)))
    Example 2 from References
    >>> kkt(x[0] ** 2 + 2 * x[1] ** 2 + 3 * x[2] ** 2,
    ... [5 * x[0] - x[1] - 3 * x[2] - 3,
    ... 2 * x[0] + x[1] + 2 * x[2] - 6])
    ([_lambda0, x0, x1, x2], FiniteSet((72/35, 72/35, 18/35, 24/35)))
    Example 3 from References
    >>> kkt(4 * x[0] ** 2 + 2 * x[1] ** 2,
    ... [-2 * x[0] - 4 * x[1] + 15],
    ... [3 * x[0] + x[1] - 8])
    ([_mu0, x0, x1], FiniteSet((64/11, 24/11, 16/11)))
    Example 4 for general KKT
    >>> t, a, b = symbols("x a b")
    >>> kkt(t ** 2 - b * t + 1, t - a, x=t)
    Piecewise((([x], FiniteSet((b/2,))), 2*a - b < 0), (([_lambda0, x], FiniteSet((2*a - b, a))), True))
    Warnings:
    This function uses recursion and if queries are such as example 4 with many inequality
    conditions, one will experience heavy performance issues.
    References:
    .. [1] http://apmonitor.com/me575/index.php/Main/KuhnTucker
    Disadvantages:
    - Does not allow for arbitrary number of inequalities and equalities.
    - Does not work for functions of f that have an infinite
    number of turning points and no equality constraints (I think)
    """
    # begin sanity checks
    if not g:
        g = []
    if not h:
        h = []
    if not x:
        x = list(f.free_symbols)
    if not isinstance(g, Iterable):
        g = [g]
    if not isinstance(h, Iterable):
        h = [h]
    if not isinstance(x, Iterable):
        x = [x]
    # end sanity checks
    def grad(func):
        """Returns the grad of an expression or function `func`"""
        grad_f = Matrix(len(x), 1, derive_by_array(func, x))
        return grad_f
    # define our dummy variables for the inequalities
    if g:
        _lambdas = list(symbols(f"_lambda:{len(g)}", real=True))
        sum_g = sum([_lambdas[i] * g[i] for i in range(len(g))])
    else:
        _lambdas = []
        sum_g = 0
    # define our dummy variables for the equalities
    if h:
        _mus = list(symbols(f"_mu:{len(h)}", real=True))
        sum_h = sum([_mus[i] * h[i] for i in range(len(h))])
    else:
        _mus = []
        sum_h = 0
    # define the lagrangian
    lagrangian = f - sum_g - sum_h
    # find grad of lagrangian
    grad_l = grad(lagrangian)
    # 1. feasibility conditions
    feasible = Matrix(len(g) + len(h), 1, g + h)
    # 2. combine everything into a vector equation
    eq = grad_l.row_insert(0, feasible)
    solution = solve(eq, x + _lambdas + _mus, set=True)
    # 4. remove all non-binding inequality constraints
    # for each _lambda solution, add a new solution with that inequality removed
    # in the case that it is false.
    pieces = []
    for i, lamb in enumerate(_lambdas):
        new_g = g[:i] + g[i + 1:]
        lamb_sol_index = [i for i, s in enumerate(solution[0]) if s == lamb][0]
        for solution_piece in solution[1]:
            lamb_sol = solution_piece[lamb_sol_index]
            try:
                if lamb_sol >= 0:  # we dont need to check the next kkt if the value is known.
                    continue
            except TypeError:  # error when inequality cannot be identified
                pass
            pieces.append((kkt(f, new_g, h, x), lamb_sol < 0))
    pieces.append((solution, True))
    return Piecewise(*pieces)
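The two cases of the by-hand derivation (example 4) can also be checked directly, without the recursive helper. This is just a sketch for that one problem, minimising x**2 - b*x + 1 subject to x - a >= 0; the variable names active and inactive are mine.

```python
import sympy as sp

x, a, b, lam = sp.symbols("x a b lambda", real=True)
f = x**2 - b*x + 1

# Case 1: the constraint x - a >= 0 is active, so x - a = 0 joins the system.
lagrangian = f - lam * (x - a)
active = sp.solve([sp.diff(lagrangian, x), x - a], [x, lam], dict=True)
print(active)

# Case 2: the multiplier would be negative, so the constraint is dropped.
inactive = sp.solve(sp.diff(f, x), x)
print(inactive)
```

Case 1 should give x = a with lambda = 2*a - b, and case 2 should give x = b/2, which are the two branches of the Piecewise result.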

How to determine the arrangement of the polynomial when displaying it with latex?

I am not sure if it is an issue with my python code or with the latex but it keeps rearranging my equation in the output.
Code:
ddx = '\\frac{{d}}{{dx}}'
f = (a * x ** m) + (b * x ** n) + d
df = sym.diff(f)
df_string = tools.polytex(df)
f_string = tools.polytex(f)
question_stem = f"Find $_\\displaystyle {ddx}\\left({f_string}\\right)$_"
output:
In this case a = 9, b = -4, c = 4, m = (-1/2), n = 3 and I want the output to be in the order of the variable f.
I have tried changing the order to 'lex', and that did not work; neither did .expand() or mode = equation.
There is an order option for the StrPrinter. If you set the order to 'none' and then pass an unevaluated Add to _print_Add you can get the desired result.
>>> from sympy.abc import a,b,c,x,m,n
>>> from sympy import Add, S, Tuple
>>> from sympy.printing.str import StrPrinter
>>> oargs = Tuple(a * x ** m, b * x ** n, c) # in desired order
>>> r = {a: 9, b: -4, c: 4, m: -S.Half, n: 3}
>>> add = Add(*oargs.subs(r).args, evaluate=False) # arg order unchanged
>>> StrPrinter({'order':'none'})._print_Add(add)
9/sqrt(x) - 4*x**3 + 4
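Since the question is about LaTeX output, the same idea can be tried with latex(), which also accepts an order setting. A minimal sketch, assuming the numeric values from the question are already known when the expression is built (f_ordered and f_string are illustrative names):

```python
from sympy import Add, Integer, Rational, latex, symbols

x = symbols("x")

# Terms listed in the order they should appear in the output.
terms = [9 * x ** Rational(-1, 2), -4 * x ** 3, Integer(4)]

# evaluate=False keeps the arguments in the order given.
f_ordered = Add(*terms, evaluate=False)

# order="none" asks the printer to use the stored argument order as-is.
f_string = latex(f_ordered, order="none")
print(f_string)
```

Any later manipulation of f_ordered (diff, expand, ...) produces a new expression whose terms are sorted again, so the ordered version is only useful for printing.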
Probably this will not be possible in general, as SymPy expressions get reordered with every manipulation, and even with just converting the expression to the internal format.
Here is some code that might work for your specific situation:
from sympy import *
from functools import reduce
a, b, c, m, n, x = symbols("a b c m n x")
f = (a * x ** m) + (b * x ** n) + c
a = 9
b = -4
c = 4
m = -Integer(1)/2
n = 3
repls = ('a', latex(a)), ('+ b', latex(b) if b < 0 else "+"+latex(b)), \
('+ c', latex(c) if c < 0 else "+"+latex(c)), ('m', latex(m)), ('n', latex(n))
f_tex = reduce(lambda a, kv: a.replace(*kv), repls, latex(f))
# only now the values of the variables are filled into f, to be used in further manipulations
f = (a * x ** m) + (b * x ** n) + c
which leaves the following in f_tex:
9 x^{- \frac{1}{2}} -4 x^{3} +4
