I need to write some code that deals with generating and manipulating multivariable polynomials. I'll outline my task with a simplified example.
Let's say I am given three expressions: 2x^2, 3y + 1, and 1z. I then need to multiply these together, which gives 6x^2yz + 2x^2z. Then I would like to find the partial derivatives of this expression with respect to x, y, and z, which gives 12xyz + 4xz, 6x^2z, and 6x^2y + 2x^2.
My problem deals with doing simple manipulations like this on expressions containing thousands of variables, and I need a way to do this systematically. I would really like to use Python, since I already have a lot of project-related functionality completed using numpy/scipy/matplotlib, but if there is a robust toolbox out there in another language I am open to using that as well. I am doing university research, so Matlab is also an option.
I haven't been able to find any good Python libraries that could do this for me easily; ideally I would like something similar to the scipy polynomial routines that works on multivariable polynomials. Does anyone know of a library suitable for this problem that would be easy to integrate into already existing Python code?
Thanks!
Follow-up: I spent a couple of days working with sympy, which turned out to be very easy to use. However, it was much too slow for the size of the problem I am working on, so I will now go explore Matlab. To give an extremely rough estimate of the speed using a small sample size: it took approximately 5 seconds to calculate each of the partial derivatives of a degree-2 polynomial containing 250 variables.
Follow-up #2: I probably should have done this back when I was still working on this problem, but I might as well let everyone know that the Matlab symbolic library was very comparable in speed to sympy. In other words, it was brutally slow for large computations. Both libraries were amazingly easy to work with, so for small computations I highly recommend either.
To solve my problem I computed the gradients by hand, simplified them, and then used the patterns I found to hard-code some values in my code. It was more work, but it made my code dramatically faster and finally usable.
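For concreteness, here is a minimal sketch of what the hand-coded route can look like for a degree-2 polynomial (the coefficient values below are made up for illustration): if p(x) = x^T A x + b^T x + c, then grad p(x) = (A + A^T) x + b, which numpy evaluates directly with no symbolic step at all.
import numpy as np

n = 250                           # number of variables, matching the timing above
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))   # made-up quadratic coefficients
b = rng.standard_normal(n)        # made-up linear coefficients
c = 1.0                           # constant term

def p(x):
    # p(x) = x^T A x + b^T x + c
    return x @ A @ x + b @ x + c

def grad_p(x):
    # Hand-derived gradient: (A + A^T) x + b
    return (A + A.T) @ x + b

x = rng.standard_normal(n)
print(p(x))
print(grad_p(x)[:3])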
Sympy is perfect for this: http://code.google.com/p/sympy/
Documentation: http://docs.sympy.org/
Examples of differentiation from the tutorial: http://docs.sympy.org/tutorial.html#differentiation
import sympy
x, y, z = sympy.symbols('x y z')  # three separate symbols
p1 = 2*x*x
p2 = 3*y + 1
p3 = z
p4 = p1*p2*p3
print(p4)
print(p4.diff(x))
print(p4.diff(y))
print(p4.diff(z))
Output:
2*x**2*z*(3*y + 1)
4*x*z*(3*y + 1)
6*x**2*z
2*x**2*(3*y + 1)
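A side note on scale: symbols can be created programmatically, so the same pattern extends to hundreds or thousands of variables. A small made-up sketch, independent of the question's exact polynomials:
import sympy

xs = sympy.symbols('x0:250')                      # x0, x1, ..., x249
p = sum(xi**2 for xi in xs) + 3*xs[0]*xs[1] + 1   # made-up degree-2 polynomial
grad = [p.diff(xi) for xi in xs]                  # all 250 partial derivatives
print(grad[0], grad[1])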
If you are using MATLAB, then the Symbolic Math Toolbox works well, IF you have it. If not, then use my sympoly toolbox; just download it from the MATLAB File Exchange.
sympoly x y z
A = 2*x^2; B = 3*y + 1;C = 1*z;
gradient(A*B*C)
ans =
Sympoly array has size = [1 3]
Sympoly array element [1 1]
4*x*z + 12*x*y*z
Sympoly array element [1 2]
6*x^2*z
Sympoly array element [1 3]
2*x^2 + 6*x^2*y
Note that this gives you an independent check on the partial derivatives in your question, including the one with respect to z, which is easy to get wrong by hand.
Matlab and the other tools you mention typically do numerical computing. You should consider Mathematica or another computer algebra system (CAS) for symbolic computation. See http://en.wikipedia.org/wiki/Comparison_of_computer_algebra_systems for a comparison of the various CASs.
Related
I'm teaching myself linear algebra, and I'm trying to learn the corresponding Numpy and Sympy code alongside it.
My book presented the following matrix:
example1 = Matrix([[3,5,-4,0],[-3,-2,4,0],[6,1,-8,0]])
with the instructions to determine if there is a nontrivial solution. The final solution would be x = x3 * Matrix([[4/3],[0],[1]]). (Using Jupyter's math mode, I used the following to represent the solution:)
$$\pmb{x} =
\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} =
\begin{bmatrix}\frac{4}{3}x_3\\0\\x_3\end{bmatrix} =
x_3\begin{bmatrix}\frac{4}{3}\\0\\1\end{bmatrix} \\
= x_3\pmb{v} \text{, where }\pmb{v} = \begin{bmatrix}\frac{4}{3}\\0\\1\end{bmatrix}$$
How can I now solve this in Sympy? I've looked through the documentation, but I didn't see anything, and I'm at a bit of a loss. I know that errors tend to be thrown for free variables. Is there a way to determine nontrivial solutions and the corresponding general solution using Sympy, considering that nontrivial solutions are reliant upon free variables? Or is np.linalg generally more preferred for this type of problem?
This is a linear system, so:
>>> from sympy import Matrix, linsolve
>>> linsolve(Matrix([[3,5,-4,0],[-3,-2,4,0],[6,1,-8,0]]))
FiniteSet((4*tau0/3, 0, tau0))
The tau0 is the free parameter that you refer to as x3.
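If you want the x = x3*v form directly, the null space of the coefficient matrix (with the zero right-hand-side column dropped) gives the same basis vector; a short sketch:
from sympy import Matrix

A = Matrix([[3, 5, -4], [-3, -2, 4], [6, 1, -8]])  # coefficient matrix only
print(A.nullspace())  # [Matrix([[4/3], [0], [1]])], i.e. the v in the question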
I have a dataset including the distance and bearing to ~1800 fused radar contacts as well as the actual distance and bearing to those contacts, and I need to develop a correction equation to get the perceived values to be as close to the actual values as possible.
There seems to be a trend in the error when visualizing, so it seems to me that there should be a somewhat simple equation to correct it.
This is the form of the ~1800 equations:
actual_distance = perceived_distance + X(perceived_bearing) + Y(speed_over_ground) + Z(course_over_ground) + A(heading)
What is the best way to solve for X, Y, Z, and A?
Also, I'm not convinced that all of these factors are necessary, so I'm completely willing to leave out one or two of the factors.
From the little linear algebra I understand, I've attempted something like this with no luck:
Ax = b --> x = A^-1 b via numpy.linalg.solve(A, b)
where A is the ~1800 x 4 matrix and b is the length ~1800 vector
Is this on the right track?
To be clear, I'm expecting to generate coefficients for an equation that will correct the perceived distance to a contact so that it is as close as possible to the actual distance to the contact.
I am also totally willing to abandon this method if there is a better one.
Thanks for your help in advance.
The best way to solve such a system of equations is to use the Incomplete Cholesky Conjugate Gradient (ICCG) technique. It can be implemented in Matlab, with Numerical Recipes in C++, the NAG Fortran library, or many other languages. It's very efficient: basically you are inverting a large banded matrix. Golub and Van Loan's book Matrix Computations describes it in detail.
Looks like this is useful:
https://docs.scipy.org/doc/numpy-1.14.1/reference/generated/numpy.linalg.cholesky.html
When you have more equations than unknowns, you might not have an exact solution. In such a case you can use the Moore-Penrose pseudoinverse of your matrix A: pinv(A) times b gives you the least-squares solution. In numpy you can use https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.lstsq.html#numpy.linalg.lstsq
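A minimal sketch of that with numpy.linalg.lstsq, using the variable names from the question (the data below is randomly generated just so the example runs; swap in the real ~1800 contacts):
import numpy as np

rng = np.random.default_rng(0)
n = 1800
perceived_bearing  = rng.uniform(0, 360, n)
speed_over_ground  = rng.uniform(0, 20, n)
course_over_ground = rng.uniform(0, 360, n)
heading            = rng.uniform(0, 360, n)
perceived_distance = rng.uniform(1, 30, n)
actual_distance    = perceived_distance + 0.01*perceived_bearing  # fake relation

# One row per contact; each column multiplies one unknown coefficient.
M = np.column_stack([perceived_bearing, speed_over_ground,
                     course_over_ground, heading])
rhs = actual_distance - perceived_distance  # what the correction must explain

coeffs, residuals, rank, sv = np.linalg.lstsq(M, rhs, rcond=None)
X, Y, Z, A = coeffs
print(X, Y, Z, A)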
I want to solve a linear program in python. The number of variables (I will call it N from now on) is very large (~50000) and in order to formulate the problem in the way scipy.optimize.linprog requires it, I have to construct two N x N matrices (A and B below). The LP can be written as
minimize: c.x
subject to:
A.x <= a
B.x = b
x_i >= 0 for all i in {1, ..., N}
where . denotes the dot product and a, b, and c are vectors of length N.
My experience is that constructing such large matrices (A and B have both approx. 50000x50000 = 25*10^8 entries) comes with some issues: If the hardware is not very strong, NumPy may refuse to construct such big matrices at all (see for example Very large matrices using Python and NumPy) and even if NumPy creates the matrix without problems, there is a huge performance issue. This is natural regarding the huge amount of data NumPy has to deal with.
However, even though my linear program comes with N variables, the matrices I work with are very sparse. One of them has only entries in the very first row, the other one only in the first M rows, with M < N/2. Of course I would like to exploit this fact.
As far as I have read (e.g. Trying to solve a Scipy optimization problem using sparse matrices and failing), scipy.optimize.linprog does not work with sparse matrices. Therefore, I have the following questions:
Is it actually true that SciPy does not offer any possibility to solve a linear program with sparse matrices? (If not, how can I do it?)
Do you know of any alternative library that will solve the problem more efficiently than SciPy with non-sparse matrices? (The library suggested in the thread above seems not to be flexible enough for my purposes, as far as I understand its documentation.)
Can it be expected that a new implementation of the simplex algorithm (using plain Python, no C) that exploits the sparsity of the matrices will be more efficient than SciPy with non-sparse matrices?
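Regarding the first question: it depends on the SciPy version. Newer SciPy releases (1.6 and later) do accept scipy.sparse constraint matrices via the HiGHS methods, so the dense construction can be avoided entirely. A minimal sketch with the sparsity pattern described above (sizes made up; assumes a recent SciPy):
import numpy as np
from scipy import sparse
from scipy.optimize import linprog

N, M = 50000, 20000
c = np.ones(N)

# Inequality matrix with entries only in its very first row, as described above.
A_ub = sparse.csr_matrix((np.ones(N), (np.zeros(N, dtype=int), np.arange(N))),
                         shape=(1, N))
b_ub = np.array([100.0])

# Equality matrix with entries only in the first M rows (a diagonal, for illustration).
A_eq = sparse.eye(M, N, format='csr')
b_eq = np.zeros(M)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=(0, None), method='highs')
print(res.status, res.fun)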
I would say forming a dense matrix (or two) to solve a large sparse LP is probably not the right thing to do. When solving a large sparse LP it is important to use a solver that has facilities to handle such problems and also to generate the model in a way that does not create explicitly any of these zero elements.
Writing a stable, fast, sparse Simplex LP solver in Python as a replacement for the SciPy dense solver is not a trivial exercise. Moreover a solver written in pure Python may not perform as well.
For the size you indicate, although not very, very large ('large medium-sized' might be a fair classification), you may want to consider a commercial solver like Cplex, Gurobi or Mosek. These solvers are very fast and very reliable (they solve basically any LP problem you throw at them). They all have Python APIs, and they are free or very cheap for academics.
If you want to use an Open Source solver, you may want to look at the COIN CLP solver. It also has a Python interface.
If your model is more complex, then you may also want to consider using a Python modeling tool such as PuLP or Pyomo (Gurobi also has good modeling support in Python).
I can't believe nobody has pointed you in the direction of PuLP! You will be able to create your problem efficiently, like so:
import pulp
prob = pulp.LpProblem("test_problem", pulp.LpMaximize)
x = pulp.LpVariable.dicts('x', range(5), lowBound=0.0)
prob += pulp.lpSum([(ix + 1)*x[ix] for ix in range(5)]), "objective"
prob += pulp.lpSum(x.values()) <= 3, "capacity"
prob.solve()
for k, v in prob.variablesDict().items():
    print(k, v.value())
PuLP is fantastic, comes with a very decent solver (CBC) and can be hooked up to open source and commercial solvers. I am currently using it in production for a forestry company and exploring Dippy for the hardest (integer) problems we have. Best of luck!
I'm playing around with SymPy and it is very powerful. However, I would like to get it to 'slow down' and work through an equation a piece at a time instead of evaluating most of it at once. For instance, given an input string equation (assuming the correct form) like
9x-((17-3)(4x)) - 8(34x)
I would like to first solve
9x-((14)(4x)) - 8(34x)
And then
9x-(56x) - 8(34x)
and then
9x-(56x) - 272x
And so on.
Another example,
from sympy import *
x = symbols('x')
s = (30*(5*(5 - 10) - 10*x)) + 10
s2 = expand(s, basic=False)
This gives me -300*x - 740 in one step, and I just want a single multiplication done at a time.
Reading the ideas document produced as a result of the Google Summer of Code, this appears to be something yet to be added to the library. As it stands, there is no way of doing this for your example without completely coding something yourself.
The issue of converting algorithms that are not equivalent to human workings, into discrete steps, is discussed and highlighted in the above document. I'm not sure if that'd be an issue in the implementation of expansion, but it's certainly an issue for other algorithms, which machines compute differently for reasons of efficiency.
tl;dr This library doesn't support step-by-step breakdowns for your example. Only the manualintegrate function currently has step-by-step workings.
I have systems of polynomials, fairly simple polynomial expressions, but rather too long
to optimize by hand. The expressions are grouped in sets, and within a given set there are common terms in several variables.
I would like to know if there is a computer algebra system, such as Mathematica, Matlab, or sympy, that can optimize multiple polynomials with common terms to minimize the number of operations. It would also be great if such a system could minimize the number of intermediate terms to reduce the number of registers.
If no such system exists, I am going to write my own using Sympy, the Python symbolic algebra library. If you are working on such a package or are interested in developing or using one, please let me know.
Here is a made-up example:
x0 = ((t - q*A)*x + B)*y
y0 = ((t - q*A)*y + B)*z
z0 = ((t - q*A)*z + B)*x
so you can obviously factor out the (t - q*A) term. Now if you make the number of terms very large, with various combinations of common terms, it becomes difficult to do by hand. The equations I have involve up to 40 terms and the size of a set is around 20. Hope that helps.
Thank you
Is sympy what you're looking for? I do believe it has support for polynomials, although I don't know if it supports all the features you may desire (still, tweaking it to add what you think it might be missing has to be easier than writing your own from scratch ;-).
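As a concrete pointer: sympy's cse() (common subexpression elimination) does the factoring-out part automatically. A short sketch on the made-up example from the question:
import sympy

t, q, A, B, x, y, z = sympy.symbols('t q A B x y z')
exprs = [((t - q*A)*x + B)*y,
         ((t - q*A)*y + B)*z,
         ((t - q*A)*z + B)*x]

# cse() returns the list of (symbol, subexpression) replacements it pulled out,
# plus the original expressions rewritten in terms of those symbols.
replacements, reduced = sympy.cse(exprs)
print(replacements)   # e.g. [(x0, -A*q + t)]
print(reduced)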
Have you considered Maxima?
It is an impressive symbolic computation package that is free, open source, and has a strong and active community that provides valuable assistance when dealing with non-obvious formulations. It is readily available for all three major operating systems, and has a precompiled Windows binary.
You have a variety of algebraic manipulation commands available for expressions and for systems of equations (such as yours): expand, factor, simplify, ratsimp, linsolve, etc.
This page (Maxima for Symbolic Computation) should get you started: downloading, installing, a few examples, and pointers to additional resources to guide you on your way, including a quick command reference / cheat sheet and some guidelines for writing your own scripts.
Well, Mathematica can certainly do all sorts of transformations on sets of polynomial equations such as yours, and some of those transformations could reduce the number of terms. Whether that is the right answer for you is open to question, as you don't seem to have a copy available. I expect that the same is true for Maple and for most of the other CASs out there.
But your mention of
reduce number of registers
suggests that you are actually trying to do some data-flow analysis for compilation. You might want to look at the literature on that topic too. Some of that literature does indeed refer to computer-algebra-like transformations on expressions.
I'm late to the party, but anyway: there is a function optimize in Maxima (https://maxima.sourceforge.io) which identifies common subexpressions and emits a blob of code which can be evaluated. For the example shown in the problem statement, I get:
(%i11) optimize ([x0 = ((t-A*q)*x+B)*y,
y0 = ((t-A*q)*y+B)*z,
z0 = x*((t-A*q)*z+B)]);
(%o11) block([%1],
%1 : t - A q,
[x0 = (%1 x + B) y,
y0 = (%1 y + B) z,
z0 = x (%1 z + B)])
As you can see, t - A*q was pulled out and assigned to a made-up variable %1 (the percent sign is an allowed character for symbols in Maxima), which is then reused to compute the other results.
Typing ? optimize at the input prompt shows some documentation about it.