Edit: more details
Hello, I found this problem through one of my teachers, but I still don't understand how to approach it. I would like to know if anyone has any ideas:
Create a program capable of randomly generating systems of equations that contain between 2 and 8 variables. The program will ask the user for the number of variables in the system of equations using the input function. The coefficients must lie in the range [-10, 10], but no coefficient may be 0. Both the coefficients and the solutions must be integers.
The goal is to print the system and show the solution to the variables (x,y,z,...). NumPy is allowed.
As far as I understand it should work this way:
Enter the number of variables: 2
x + y = 7
4x - y =3
x = 2
y = 5
I'm still learning arrays in Python, but do they work the same as in MATLAB?
Thank you in advance :)!
For k variables, the left-hand side of the equations consists of k unknowns and a k×k matrix of coefficients. The dot product of those two gives you the right-hand side. Then it's a simple case of printing that however you want.
import numpy as np

def generate_linear_equations(k):
    # all non-zero integers in [-10, 10] are allowed as coefficients
    coeffs = [*range(-10, 0), *range(1, 11)]
    rng = np.random.default_rng()
    # k x k coefficient matrix plus k integer values for the unknowns
    return rng.choice(coeffs, size=(k, k)), rng.integers(-10, 11, k)

k = int(input('Enter the number of variables: '))
if not 2 <= k <= 8:
    raise ValueError('The number of variables must be between 2 and 8.')
coeffs, variables = generate_linear_equations(k)
solution = coeffs.dot(variables)  # right-hand side implied by the chosen values

symbols = 'abcdefgh'[:k]
for row, sol in zip(coeffs, solution):
    lhs = ' '.join(f'{r:+}{s}' for r, s in zip(row, symbols)).lstrip('+')
    print(f'{lhs} = {sol}')
print()
for s, v in zip(symbols, variables):
    print(f'{s} = {v}')
For example, this can give:
Enter the number of variables: 3
8a +6b -4c = -108
9a -9b -4c = 3
10a +10b +9c = -197
a = -9
b = -8
c = -3
If you specifically want the lhs formatting to have a space after the sign, and to hide a coefficient whose absolute value is 1, then you need something more complex. Substitute lhs for the following:
def sign(n):
    return '+' if n > 0 else '-'

lhs = ' '.join(f'{sign(r)} {abs(r)}{s}' if r not in (-1, 1) else f'{sign(r)} {s}'
               for r, s in zip(row, symbols))
# drop the leading '+ ', or fold a leading '- ' into the first term
lhs = lhs[2:] if lhs.startswith('+') else f'-{lhs[2:]}'
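For a quick check, here is the substituted formatting applied to a hand-picked row (the values are just for illustration; sign() is the helper defined above):
row, symbols = [1, -7, 10], 'abc'
lhs = ' '.join(f'{sign(r)} {abs(r)}{s}' if r not in (-1, 1) else f'{sign(r)} {s}'
               for r, s in zip(row, symbols))
lhs = lhs[2:] if lhs.startswith('+') else f'-{lhs[2:]}'
print(lhs)  # prints: a - 7b + 10c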
I did this by randomly generating the left-hand side and the solution within your constraints, then plugging the solution into the equations to generate the right-hand side. Feel free to ask for clarification about any part of the code.
import numpy as np

num_variables = int(input('Number of variables:'))
# all non-zero integers in [-10, 10] are valid coefficients
valid_integers = np.asarray([x for x in range(-10, 11) if x != 0])
lhs_shape = (num_variables, num_variables)
lhs = np.random.choice(valid_integers, lhs_shape)
solution = np.random.randint(-10, 11, num_variables)
rhs = lhs.dot(solution)  # plug the solution in to get the right-hand side

for i in range(num_variables):
    for j in range(num_variables):
        symbol = '=' if j == num_variables - 1 else '+'
        print(f'{lhs[i, j]:3d}*x{j+1} {symbol} ', end='')
    print(rhs[i])

for i in range(num_variables):
    print(f'x{i+1} = {solution[i]}')
Example output:
Number of variables:2
2*x1 + -7*x2 = -84
-4*x1 + 1*x2 = 38
x1 = -7
x2 = 10
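As a side note, one can sanity-check the generated system by solving it back numerically. This is just a sketch assuming the lhs, rhs, and solution arrays from the code above are in scope; note that a randomly generated matrix can occasionally be singular, in which case np.linalg.solve raises LinAlgError:
import numpy as np
recovered = np.linalg.solve(lhs, rhs)   # solve lhs @ x = rhs
print(np.allclose(recovered, solution)) # True when lhs is non-singular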
In my research I'm trying to tackle the Kolmogorov backward equation, i.e. I'm interested in
$$Af = b(x)f'(x)+\sigma(x)f''(x)$$
With specific $b(x)$ and $\sigma(x)$, I'm trying to see how fast the coefficients of the expression grow when calculating higher powers of $A$ applied to $f$. I'm struggling to derive this analytically, so I tried to observe the trend empirically.
First, I have used sympy:
from sympy import *
import matplotlib.pyplot as plt
import re
import math
import numpy as np
import time
np.set_printoptions(suppress=True)
x = Symbol('x')
b = Function('b')(x)
g = Function('g')(x)

def new_coef(gamma, beta, coef_minus2, coef_minus1, coef):
    return expand(simplify(gamma*coef_minus2 + beta*coef_minus1 + 2*gamma*coef_minus1.diff(x)\
        + beta*coef.diff(x) + gamma*coef.diff(x, 2)))
def new_coef_first(gamma, beta, coef):
    return expand(simplify(beta*coef.diff(x) + gamma*coef.diff(x, 2)))
def new_coef_second(gamma, beta, coef_minus1, coef):
    return expand(simplify(beta*coef_minus1 + 2*gamma*coef_minus1.diff(x)\
        + beta*coef.diff(x) + gamma*coef.diff(x, 2)))
def new_coef_last(gamma, beta, coef_minus2):
    return expand(simplify(gamma*coef_minus2))
def new_coef_second_to_last(gamma, beta, coef_minus2, coef_minus1):
    return expand(simplify(gamma*coef_minus2 + beta*coef_minus1 + 2*gamma*coef_minus1.diff(x)))
def set_to_zero(expression):
    expression = expression.subs(Derivative(b, x, x, x), 0)
    expression = expression.subs(Derivative(b, x, x), 0)
    expression = expression.subs(Derivative(g, x, x, x, x), 0)
    expression = expression.subs(Derivative(g, x, x, x), 0)
    return expression
def sum_of_coef(expression):
    sum_of_coef = 0
    for i in str(expression).split(' + '):
        if i[0:1] == '(':
            i = i[1:]
        integers = re.findall(r'\b\d+\b', i)
        if len(integers) > 0:
            length_int = len(integers[0])
            if i[0:length_int] == integers[0]:
                sum_of_coef += int(integers[0])
            else:
                sum_of_coef += 1
        else:
            sum_of_coef += 1
    return sum_of_coef
power = 6
charar = np.zeros((power, power*2), dtype=Symbol)
coef_sum_array = np.zeros((power, power*2))
charar[0,0] = b
charar[0,1] = g
coef_sum_array[0,0] = 1
coef_sum_array[0,1] = 1
for i in range(1, power):
    for j in range(0, (i+1)*2):
        if j == 0:
            charar[i,j] = set_to_zero(new_coef_first(g, b, charar[i-1, j]))
        elif j == 1:
            charar[i,j] = set_to_zero(new_coef_second(g, b, charar[i-1, j-1], charar[i-1, j]))
        elif j == (i+1)*2-2:
            charar[i,j] = set_to_zero(new_coef_second_to_last(g, b, charar[i-1, j-2], charar[i-1, j-1]))
        elif j == (i+1)*2-1:
            charar[i,j] = set_to_zero(new_coef_last(g, b, charar[i-1, j-2]))
        else:
            charar[i,j] = set_to_zero(new_coef(g, b, charar[i-1, j-2], charar[i-1, j-1], charar[i-1, j]))
        coef_sum_array[i,j] = sum_of_coef(charar[i,j])
coef_sum_array
Then I looked into automatic differentiation and used autograd:
import autograd.numpy as np
from autograd import grad
import time
np.set_printoptions(suppress=True)

b = lambda x: 1 + x
g = lambda x: 1 + x + x**2

def new_coef(gamma, beta, coef_minus2, coef_minus1, coef):
    return lambda x: gamma(x)*coef_minus2(x) + beta(x)*coef_minus1(x) + 2*gamma(x)*grad(coef_minus1)(x)\
        + beta(x)*grad(coef)(x) + gamma(x)*grad(grad(coef))(x)
def new_coef_first(gamma, beta, coef):
    return lambda x: beta(x)*grad(coef)(x) + gamma(x)*grad(grad(coef))(x)
def new_coef_second(gamma, beta, coef_minus1, coef):
    return lambda x: beta(x)*coef_minus1(x) + 2*gamma(x)*grad(coef_minus1)(x)\
        + beta(x)*grad(coef)(x) + gamma(x)*grad(grad(coef))(x)
def new_coef_last(gamma, beta, coef_minus2):
    return lambda x: gamma(x)*coef_minus2(x)
def new_coef_second_to_last(gamma, beta, coef_minus2, coef_minus1):
    return lambda x: gamma(x)*coef_minus2(x) + beta(x)*coef_minus1(x) + 2*gamma(x)*grad(coef_minus1)(x)
power = 6
coef_sum_array = np.zeros((power, power*2))
coef_sum_array[0,0] = b(1.0)
coef_sum_array[0,1] = g(1.0)
charar = [b, g]
for i in range(1, power):
    print(i)
    charar_new = []
    for j in range(0, (i+1)*2):
        if j == 0:
            new_funct = new_coef_first(g, b, charar[j])
        elif j == 1:
            new_funct = new_coef_second(g, b, charar[j-1], charar[j])
        elif j == (i+1)*2-2:
            new_funct = new_coef_second_to_last(g, b, charar[j-2], charar[j-1])
        elif j == (i+1)*2-1:
            new_funct = new_coef_last(g, b, charar[j-2])
        else:
            new_funct = new_coef(g, b, charar[j-2], charar[j-1], charar[j])
        coef_sum_array[i,j] = new_funct(1.0)
        charar_new.append(new_funct)
    charar = charar_new
coef_sum_array
However, I'm not happy with the speed of either of them. I would like to do at least a thousand iterations, but after 3 days of running the sympy method I got only 30 :/
I expect that the second (numerical) method could be optimized to avoid recalculating expressions every time. Unfortunately, I cannot see that solution myself. I have also tried Maple, but again without luck.
Overview
So, there are two formulas about derivatives that are interesting here:
Faa di Bruno's formula, which is a way to quickly find the n-th derivative of f(g(x)) and looks a lot like the Multinomial theorem
The General Leibniz rule, which is a way to quickly find the n-th derivative of f(x)*g(x) and looks a lot like the Binomial theorem
Both of these have been discussed in pull request #13892, where the n-th derivative was sped up using the general Leibniz rule.
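As a small self-contained illustration of the general Leibniz rule (this snippet is mine, not from the pull request):
from sympy import Function, Symbol, diff, binomial, simplify
x = Symbol('x')
f, g = Function('f')(x), Function('g')(x)
n = 3
lhs = diff(f*g, x, n)  # SymPy applies the product rule repeatedly
rhs = sum(binomial(n, k)*diff(f, x, k)*diff(g, x, n-k) for k in range(n+1))
print(simplify(lhs - rhs))  # 0, confirming the rule for n = 3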
I'm trying to see how fast the coefficients of the expression are growing
In your code, the general formula for computing c[i][j] is this:
c[i][j] = g * c[i-1][j-2] + b * c[i-1][j-1] + 2 * g * c'[i-1][j-1] + b * c'[i-1][j] + g * c''[i-1][j]
(where c'[i][j] and c''[i][j] denote the 1st and 2nd derivatives of c[i][j])
Because of this, and by the Leibniz rule mentioned above, I think intuitively, the coefficients computed should be related to Pascal's triangle (or at the very least they should have some combinatorial relation).
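A one-step SymPy check (illustrative, not from the original code) shows where each term of this formula comes from: apply A = b*d/dx + g*d2/dx2 to a single term c(x)*f(x), with f standing in for some derivative f^(n), and collect the derivatives of f.
from sympy import Function, Symbol, diff, expand
x = Symbol('x')
b, g, c, f = (Function(name)(x) for name in 'bgcf')
Af = expand(b*diff(c*f, x) + g*diff(c*f, x, 2))
print(Af)
# collecting terms: f gets b*c' + g*c'', f' gets b*c + 2*g*c', f'' gets g*c,
# which are exactly the three groups of contributions that row i-1 sends to c[i][j]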
Optimization #1
In the original code, the function sum_of_coef(f) serializes the expression f to a string, discards everything that doesn't look like a number, and then sums the remaining numbers.
We can avoid serialization here by just traversing the expression tree and collecting what we need:
def sum_of_coef(f):
    s = 0
    if f.func == Add:
        for sum_term in f.args:
            res = sum_term if sum_term.is_Number else 1
            if len(sum_term.args) == 0:
                s += res
                continue
            first = sum_term.args[0]
            if first.is_Number == True:
                res = first
            else:
                res = 1
            s += res
    elif f.func == Mul:
        first = f.args[0]
        if first.is_Number == True:
            s = first
        else:
            s = 1
    elif f.func == Pow:
        s = 1
    return s
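A quick sanity check of this traversal on a made-up expression (illustrative only):
from sympy import symbols
a1, a2, a3 = symbols('a1 a2 a3')
print(sum_of_coef(3*a1*a2 + a3 + 5))  # 3 + 1 + 5 = 9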
Optimization #2
In the function set_to_zero(expr) all the 2nd and 3rd derivatives of b, and the 3rd and 4th derivatives of g are replaced by zero.
We can collapse all those substitutions into one statement like so:
b3, b2 = b.diff(x, 3), b.diff(x, 2)
g4, g3 = g.diff(x, 4), g.diff(x, 3)

def set_to_zero(expression):
    expression = expression.subs({b3: 0, b2: 0, g4: 0, g3: 0})
    return expression
Optimization #3
In the original code, we're calling simplify for every cell c[i][j]. This turns out to have a big impact on performance, but we can actually skip this call because, fortunately, our expressions are just sums of products of derivatives or unknown functions.
So the line
charar[i,j] = set_to_zero(expand(simplify(expr)))
becomes
charar[i,j] = set_to_zero(expand(expr))
Optimization #4
The following was also tried but turned out to have very little impact.
For two consecutive values of j, we're computing c'[i-1][j-1] twice; the three-cell window just slides one position:
column j-1 uses:  c[i-1][j-3]  c[i-1][j-2]  c[i-1][j-1]
column j   uses:  c[i-1][j-2]  c[i-1][j-1]  c[i-1][j]
If you look at the loop formula in the else branch, you see that c'[i-1][j-1] has already been computed. It can be cached (a minimal sketch follows below), but this optimization has little effect in the SymPy version of the code.
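A minimal sketch of what that caching could look like inside the inner loop (variable names are mine, not from the original code): at column j we already compute c'[i-1][j] as c0.diff(x), and at column j+1 that same expression is needed again as c'[i-1][j-1], so it can simply be carried over.
cached_diff = None
for j in range(jmax):
    c0 = charar[i-1][j]
    c0_diff = c0.diff(x)   # needed in this column anyway
    c1_diff = cached_diff  # equals charar[i-1][j-1].diff(x) when j >= 1
    # ... build expr using c1_diff instead of recomputing c1.diff(x) ...
    cached_diff = c0_diff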
Here it's also important to mention that it's possible to visualize the call tree of SymPy involved in computing these derivatives. It's actually larger, but here is part of it: [call-tree visualization]
We can also generate a flamegraph using the py-spy module just to see where time is being spent: [flamegraph]
As far as I could tell, 34% of the time is spent in _eval_derivative_n_times, 10% in the function getit from assumptions.py, 12% in subs(..), and 12% in expand(..).
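For reference, such a flamegraph can be produced with py-spy's record subcommand (the script name below is a placeholder, not from the original post):
py-spy record -o flamegraph.svg -- python3 compute_coefficients.py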
Optimization #5
Apparently when pull request #13892 was merged into SymPy, it also introduced a performance regression.
In one of the comments regarding that regression, Ondrej Certik recommends using SymEngine to improve performance of code that makes heavy use of derivatives.
So I've ported the code mentioned to SymEngine.py and noticed that it runs 98 times faster than the SymPy version for power=8 (and 4320 times faster for power=30).
The required module can be installed via pip3 install --user symengine.
#!/usr/bin/python3
from symengine import *
import pprint

x = var("x")
b = Function("b")(x)
g = Function("g")(x)
b3, b2 = b.diff(x, 3), b.diff(x, 2)
g4, g3 = g.diff(x, 4), g.diff(x, 3)

def set_to_zero(e):
    e = e.subs({b3: 0, b2: 0, g4: 0, g3: 0})
    return e

def sum_of_coef(f):
    s = 0
    if f.func == Add:
        for sum_term in f.args:
            res = 1
            if len(sum_term.args) == 0:
                s += res
                continue
            first = sum_term.args[0]
            if first.is_Number == True:
                res = first
            else:
                res = 1
            s += res
    elif f.func == Mul:
        first = f.args[0]
        if first.is_Number == True:
            s = first
        else:
            s = 1
    elif f.func == Pow:
        s = 1
    return s

def main():
    power = 8
    charar = [[0] * (power*2) for _ in range(power)]
    coef_sum_array = [[0] * (power*2) for _ in range(power)]
    charar[0][0] = b
    charar[0][1] = g
    init_printing()
    for i in range(1, power):
        jmax = (i+1)*2
        for j in range(0, jmax):
            # for small j these negative indices pick up unused placeholder
            # entries; only the valid ones are referenced in each branch below
            c2, c1, c0 = charar[i-1][j-2], charar[i-1][j-1], charar[i-1][j]
            if j == 0:
                expr = b*c0.diff(x) + g*c0.diff(x, 2)
            elif j == 1:
                expr = b*c1 + 2*g*c1.diff(x) + b*c0.diff(x) + g*c0.diff(x, 2)
            elif j == jmax-2:
                expr = g*c2 + b*c1 + 2*g*c1.diff(x)
            elif j == jmax-1:
                expr = g*c2
            else:
                expr = g*c2 + b*c1 + 2*g*c1.diff(x) + b*c0.diff(x) + g*c0.diff(x, 2)
            charar[i][j] = set_to_zero(expand(expr))
            coef_sum_array[i][j] = sum_of_coef(charar[i][j])
    pprint.pprint(Matrix(coef_sum_array))

main()
Performance after optimization #5
I think it would be very interesting to look at the number of terms in c[i][j] to determine how quickly the expressions are growing. That would definitely help in estimating the complexity of the current code.
But for practical purposes I've plotted the current time and memory consumption of the SymEngine code above: [chart: time and memory vs. power]
Both the time and the memory seem to be growing polynomially with the input (the power parameter in the original code).
The same chart as a log-log plot: [log-log chart]
As the Wikipedia page on log-log plots says, a straight line on a log-log plot corresponds to a monomial. This offers a way to recover the exponent of the monomial.
So if we consider two points N=16 and N=32 between which the log-log plot looks like a straight line:
import pandas as pd
from math import log  # needed for the slope computation

df = pd.read_csv("modif6_bench.txt", sep=',', header=0)

def find_slope(col1, col2, i1, i2):
    xData = df[col1].to_numpy()
    yData = df[col2].to_numpy()
    x0, x1 = xData[i1], xData[i2]
    y0, y1 = yData[i1], yData[i2]
    m = log(y1/y0)/log(x1/x0)
    return m

print("time slope = {0:0.2f}".format(find_slope("N", "time", 16, 32)))
print("memory slope = {0:0.2f}".format(find_slope("N", "memory", 16, 32)))
Output:
time slope = 5.69
memory slope = 2.62
So a very rough approximation of the time complexity would be O(n^5.69), and an approximation of the space complexity would be O(n^2.62).
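To sanity-check the slope arithmetic (the ratio here is back-computed for illustration, not a new measurement): doubling N from 16 to 32 multiplies the running time by roughly 2^5.69 ≈ 52, since m = log(t1/t0)/log(x1/x0) and log(51.6)/log(2) ≈ 5.69.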
There are more details about deciding whether the growth rate is polynomial or exponential here (it involves drawing a semi-log and a log-log plot, and seeing where the data shows up as a straight line).
Performance with defined b and g functions
In the first original code block, the functions b and g were undefined functions. This means SymPy and SymEngine didn't know anything about them.
The 2nd original code block defines b=1+x and g=1+x+x**2. If we run all of this again with known b and g, the code runs much faster, and both the running-time and the memory-usage curves are better than with unknown functions:
time slope = 2.95
memory slope = 1.35
Recorded data fitting onto known growth-rates
I wanted to look a bit more into matching the observed resource consumption (time and memory), so I wrote the following Python module, which fits the recorded data against each growth rate from a catalog of common growth rates and then shows the plot to the user.
It can be installed via pip3 install --user matchgrowth
When run like this:
match-growth.py --infile ./tests/modif7_bench.txt --outfile time.png --col1 N --col2 time --top 1
It produces graphs of the resource usage, as well as the closest growth rates it matches. In this case, it finds polynomial growth to be the closest match: [matchgrowth plot]
Other notes
If you run this for power=8 (in the SymEngine code mentioned above), the coefficients will look like this:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 5, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 17, 40, 31, 9, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 53, 292, 487, 330, 106, 16, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 161, 1912, 6091, 7677, 4693, 1520, 270, 25, 1, 0, 0, 0, 0, 0, 0]
[1, 485, 11956, 68719, 147522, 150706, 83088, 26573, 5075, 575, 36, 1, 0, 0, 0, 0]
[1, 1457, 73192, 735499, 2568381, 4118677, 3528928, 1772038, 550620, 108948, 13776, 1085, 49, 1, 0, 0]
[1, 4373, 443524, 7649215, 42276402, 102638002, 130209104, 96143469, 44255170, 13270378, 2658264, 358890, 32340, 1876, 64, 1]
So as it turns out, the 2nd column coincides with A048473, which according to OEIS is "the number of triangles (of all sizes, including holes) in Sierpiński's triangle after n inscriptions".
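A quick consistency check of that column directly from the table: the values satisfy a(n) = 3*a(n-1) + 2 (equivalently a(n) = 2*3^n - 1), which is the recurrence given for A048473 (the all-zero first row of the printed matrix is skipped here).
col2 = [5, 17, 53, 161, 485, 1457, 4373]  # rows i = 1..7 of the table above
assert all(b == 3*a + 2 for a, b in zip(col2, col2[1:]))
assert all(v == 2 * 3**(n + 1) - 1 for n, v in enumerate(col2))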
All the code for this is also available in this repo.
Relations between polynomial coefficients from the i-th line with coefficients from the (i-1)-th line
In the previous post, c[i][j] was calculated. It's possible to check that deg(c[i][j]) = j+1.
This can be checked by initializing a separate 2d array and computing the degree like so:
deg[i][j] = degree(poly(parse_expr(str(charar[i][j]))))
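Spelled out as a loop (a sketch: this assumes b and g have been given concrete polynomial definitions, as in the second original code block, so that each c[i][j] really is a polynomial in x; with undefined functions there is no single degree to extract):
from sympy import degree, poly, parse_expr, Symbol
xs = Symbol('x')
deg = [[0] * (power * 2) for _ in range(power)]
for i in range(power):
    for j in range((i + 1) * 2):
        # round-trip through str to move from SymEngine to SymPy
        deg[i][j] = degree(poly(parse_expr(str(charar[i][j])), xs))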
Vertical formulas:
Then, if we denote by u(i,j,k) the coefficient of x^k in c[i][j], we can try to find formulas for u(i,j,k) in terms of u(i-1,_,_). Formulas for u(i,j,_) will be the same as formulas for u(i+1,j,_) (and all following rows), so there's some opportunity for caching there.
Horizontal formulas:
It's also interesting that when we fix i, the formulas for u(i,j,_) look the same as those for u(i,j+1,_), except for the last 3 values of k. But I'm not sure if this can be leveraged.
The caching steps mentioned above might help to skip unnecessary computations.
See more about this here.
Some notes about analytical, closed-form solutions and asymptotics
I'm struggling to derive this analytically
Yes, this seems to be hard. The closest class of recursive sequences to the one mentioned here is the class of holonomic sequences (also called D-finite or P-recursive). The sequence c[i][j] is not C-finite because its recurrence has polynomial coefficients (in the general case, even the asymptotics of recurrences with polynomial coefficients is an open problem). However, the recurrence for c[i][j] does not qualify as holonomic either, because of the derivatives; if we were to leave out the derivatives in the formula for c[i][j], it would qualify. Here are some places where I found solutions for these:
"The Concrete Tetrahedron: Symbolic Sums, Recurrence Equations, Generating Functions, Asymptotic Estimates" by Kauers and Paule - Chapter 7 Holonomic Sequences and Power Series
Analytic Combinatorics by Flajolet and Sedgewick - Appendix B.4 Holonomic Functions
But c[i][j] is also a several-variable recurrence, which is another reason why it doesn't fit into the theory mentioned above.
There is however another book called Analytic Combinatorics in Several Variables by Robin Pemantle and Mark C. Wilson which does handle several variable recurrences.
All the books mentioned above require a lot of complex analysis, and they go far beyond the little math I currently know, so hopefully someone with a more solid grasp of that kind of math can try this out.
The most advanced CAS that has generating-function-related operations and can operate on this kind of sequence is Maple, together with the gfun package (see the gfun repo), which for now only handles the univariate case.