I'm playing around with SymPy and it is very powerful. However, I would like to get it to 'slow down' and solve pieces of an equation one at a time instead of all at once. For instance, given an input string equation (assuming the correct form) like
9x-((17-3)(4x)) - 8(34x)
I would like to first solve
9x-((14)(4x)) - 8(34x)
And then
9x-(56x) - 8(34x)
and then
9x-(56x) - 272x
And so on.
Another example,
from sympy import *
x = symbols('x')  # needed; 'from sympy import *' does not define x
s = (30*(5*(5-10)-10*x))+10
s2 = expand(s, basic=False)
Gives me -300*x - 740 in one step, and I just want a single * done at a time
Judging from the ideas document produced as a result of the Google Summer of Code, this appears to be something yet to be added to the library. As it stands, there is no way of doing this for your example without completely coding something yourself.
The issue of converting algorithms that are not equivalent to human workings into discrete steps is discussed and highlighted in that document. I'm not sure whether that would be an issue in the implementation of expansion, but it is certainly an issue for other algorithms, which machines compute differently for reasons of efficiency.
tl;dr: This library doesn't support step-by-step breakdowns for your example. Only the manualintegrate function currently has step-by-step workings.
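If you do want to code something yourself, here is a rough sketch of one possible approach (my own idea, not a SymPy feature): parse with evaluate=False so nothing is simplified up front, then on each pass rebuild the tree and let exactly one innermost subexpression evaluate. It is fragile, and its steps are finer-grained than a human's, but it illustrates the idea:

import sympy as sp

def one_step(expr):
    # Evaluate exactly one innermost subexpression; keep the rest unevaluated.
    done = [False]
    def walk(e):
        if not e.args:                        # atoms: numbers, symbols
            return e
        args = [walk(a) for a in e.args]
        lazy = e.func(*args, evaluate=False)  # rebuilt, still unevaluated
        if not done[0]:
            eager = e.func(*args)             # rebuilt with evaluation on
            if eager != lazy:                 # this node actually simplifies
                done[0] = True
                return eager
        return lazy
    return walk(expr)

e = sp.sympify('9*x - ((17 - 3)*(4*x)) - 8*(34*x)', evaluate=False)
while True:
    print(e)
    nxt = one_step(e)
    if nxt == e:                              # no node simplified: fully done
        break
    e = nxt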
Related
It is often said that compiled languages such as C perform better than interpreted languages such as Python. Therefore, I might be interested in migrating my Python implementations to C/C++ (assuming those languages also have access to a Z3 API that is in use and maintained).
However, this migration only makes sense in one case: if my performance loss is due to the language and not to Z3 itself. Therefore, I would like to know whether there is any way to find out what percentage of the execution time is spent inside Z3 and what percentage in pure Python.
A very naive possibility would be to use a timer just before and after each call to Z3 in my implementation and add up those times to finally see how much of the total those times represent. A sketch of this idea (pseudo-code):
import time

time_z3 = 0.0
start_total = time.perf_counter()
while executing:                          # main loop of my implementation
    t0 = time.perf_counter()
    call_z3()                             # any call into the Z3 API
    time_z3 += time.perf_counter() - t0
time_total = time.perf_counter() - start_total
print(time_z3 / time_total)
This, even though it is an ugly solution, would answer my first question: how much of the total execution time Z3 takes.
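(A less manual variant I am considering: Python's built-in cProfile can aggregate the time spent inside the z3 bindings' functions. Here main() is just a stand-in for my implementation's entry point:)

import cProfile
import pstats

cProfile.run('main()', 'z3_profile.stats')          # profile the whole run
stats = pstats.Stats('z3_profile.stats')
stats.sort_stats('cumulative').print_stats('z3')    # report only names matching 'z3'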
However, I would like to get even more information, if possible. I want to know not only how long Z3 takes to do its computations, but also whether using Python forces Z3 to perform large data transformations before the information arrives "pure" (i.e., as if I had written it directly in Z3), making Z3's time considerably longer than it would otherwise be. In other words: I would like to know how much time Z3 spends purely on the logical calculations (searching for models), excluding transformations and other processes.
Specifically: I want to know whether other languages like C++ make these transformations cheaper, and whether the Z3 API of some other language is therefore more recommended/effective/optimized than the Python one.
I know this is abstract, but I hope the question is understandable; if not, we can discuss it in the comments.
This question may be half computational math, half programming.
I'm trying to estimate log[\int_0^\infty\int_0^\infty f(x,y)dxdy] [actually thousands of such integrals] in Python. The function f(x,y) involves some very large/very small numbers that are bound to cause overflow/underflow errors; so I'd really prefer to work with log[f(x,y)] instead of f(x,y).
Thus my question is two parts:
1) Is there a way to estimate log[\int_0^\infty\int_0^\infty f(x,y)dxdy] using the log of the function instead of the function itself?
2) Is there an implementation of this in Python?
Thanks
I would be surprised if the math and/or numpy libraries, or perhaps some more specific third-party libraries, were not able to solve a problem like this. Here are some of their log functions:
math.log(x[, base]), math.log1p(x), math.log2(x), math.log10(x) (https://docs.python.org/3.3/library/math.html)
numpy.log, numpy.log10, numpy.log2, numpy.log1p, numpy.logaddexp, numpy.logaddexp2 (https://numpy.org/doc/stable/reference/routines.math.html#exponents-and-logarithms)
Generally, just google "logarithm python library" and try to identify similar Stack Overflow problems, which will allow you to find the right libraries and functions to try out. Once you do that, you can follow this guide so that someone can help you get from input to expected output: How to make good reproducible pandas examples
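That said, the key tool here is the log-sum-exp trick. A minimal sketch (assuming scipy is available and that the improper integral can be truncated to a finite box): evaluate log f on a grid and combine the samples with scipy.special.logsumexp, which computes log(sum(exp(...))) without ever forming the huge or tiny values. The log_integral helper and the grid sizes are my own illustration, not a library API:

import numpy as np
from scipy.special import logsumexp

def log_integral(log_f, x_max, y_max, n=1000):
    # log of the integral of f over [0, x_max] x [0, y_max], via a crude
    # Riemann sum done entirely in log space:
    # log[ sum f(x,y) dx dy ] = logsumexp(log f) + log(dx) + log(dy)
    x = np.linspace(0.0, x_max, n)
    y = np.linspace(0.0, y_max, n)
    X, Y = np.meshgrid(x, y)
    dx, dy = x[1] - x[0], y[1] - y[0]
    return logsumexp(log_f(X, Y)) + np.log(dx) + np.log(dy)

# example: f(x,y) = exp(-x - y); its integral over the quadrant is 1, so log ~ 0
print(log_integral(lambda x, y: -x - y, 50.0, 50.0))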
After using scipy.integrate for a while, I am at the point where I need more functionality, like bifurcation analysis or parameter estimation. This is why I'm interested in using PyDSTool, but from the documentation I can't figure out how to work with ModelSpec, or whether it is actually what will lead me to the solution.
Here is a toy example of what I am trying to do: I have a network with two nodes, both having the same (SIR) dynamics, described by two ODEs each, but with different initial conditions. The equations are coupled between nodes via the epsilon terms (see the formulas below).
The formulas are given as a picture for better readability; the 'n' and 'm' are indices, not exponents:
http://image.noelshack.com/fichiers/2014/28/1404918182-odes.png
(I could not use the image upload on Stack Overflow, sadly.)
In the two node case my code (using PyDSTool) looks like this:
# multiple SIR metapopulations
import PyDSTool as pdt

# parameter and initial condition definition; a dict is a must
params = {'alpha': 0.7, 'beta': 0.1, 'epsilon1': 0.5, 'epsilon2': 0.5}
ini = {'s1': 0.99, 's2': 1, 'i1': 0.01, 'i2': 0.00}

DSargs = pdt.args(name='SIRtest_multi',
                  ics=ini,
                  pars=params,
                  tdata=[0, 20],
                  # the for-macro generates formulas for s1,s2 and i1,i2;
                  # sum works similarly but sums over the expressions in it
                  varspecs={'s[o]': 'for(o,1,2,-alpha*s[o]*sum(k,1,2,epsilon[k]*i[k]))',
                            'i[l]': 'for(l,1,2,alpha*s[l]*sum(m,1,2,epsilon[m]*i[m]))'})

# generator
DS = pdt.Generator.Vode_ODEsystem(DSargs)
# computation; a trajectory object is generated
trj = DS.compute('test')
# extraction of the points for plotting
pts = trj.sample()

# plotting; pylab is imported along with PyDSTool as plt
pdt.plt.plot(pts['t'], pts['s1'], label='s1')
pdt.plt.plot(pts['t'], pts['i1'], label='i1')
pdt.plt.plot(pts['t'], pts['s2'], label='s2')
pdt.plt.plot(pts['t'], pts['i2'], label='i2')
pdt.plt.legend()
pdt.plt.xlabel('t')
pdt.plt.show()
But in my original problem there are more than 1000 nodes, with 5 ODEs each; every node is coupled to a different number of other nodes, and the epsilon values are not equal for all the nodes. So tinkering with this syntax has not led me anywhere near the solution yet.
What I am actually thinking of is a way to construct separate sub-models/solvers(?) for every node, each having its own parameters (epsilons, since they are different for every node), and then link them to each other. And this is the point where I do not know whether it is possible in PyDSTool, and whether it is the right way to handle this kind of problem.
I looked through the examples and the docs of PyDSTool but could not figure out how to do it, so help is very much appreciated! If the way I'm trying to do things is unorthodox or plain stupid, you are welcome to suggest how to do it more efficiently. (Which is actually the more efficient/faster/better way to solve problems like this: subdividing it into many small (still not decoupled) models/solvers, or one containing all the ODEs at once?)
(I'm neither a mathematician nor a programmer, but willing to learn, so please be patient!)
The solution is definitely not to build separate simulation models. That won't work because so many variables will be continuously coupled between the sub-models. You absolutely must have all the ODEs in one place together.
It sounds like the solution you need is to use the ModelSpec object constructs. These let you hierarchically build the sub-model definitions out of symbolic pieces. They can have their own "epsilon" parameters, etc. Once you have declared all the pieces, you let PyDSTool create the final strings containing the ODE definitions for you. I suggest you look at the tutorial example at:
http://www.ni.gsu.edu/~rclewley/PyDSTool/Tutorial/Tutorial_compneuro.html
and the provided examples: ModelSpec_test.py, MultiCompartments.py. But, remember that you still have to have a source for the parameters and coupling data (i.e., a big matrix or dictionary loaded from a file) to be able to automate the process of building the model, otherwise you'd still be writing it all out by hand.
You have to build some classes for the components that you want to have. You might also create a factory function (compare 'makeSoma' in the neuralcomp.py toolbox) that will take all your sub-components and create an ODE based on summing something up from each of the declared components. At the end, you can refer to the parameters by their position in the hierarchy. One might be 's1.epsilon' while another might be 'i4.epsilon'.
Unfortunately, to build models like this efficiently you will have to learn to do some more complex programming! So start by understanding all the steps in the tutorial. You can contact me directly through the SourceForge support discussions or by email once you've gotten started and have specific questions.
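As a starting point before ModelSpec, here is a rough sketch of that automation using only the plain Generator interface from your question; the eps coupling matrix here is made-up placeholder data standing in for whatever you load from file:

import PyDSTool as pdt

N = 4                                   # stands in for your ~1000 nodes
# made-up coupling matrix: eps[n][m] couples node m's infecteds into node n
eps = [[0.5 if n != m else 0.0 for m in range(N)] for n in range(N)]

pars = {'alpha': 0.7}
varspecs, ics = {}, {}
for n in range(N):
    # build the "force of infection" string for node n from its couplings
    force = ' + '.join('eps_%d_%d*i%d' % (n, m, m) for m in range(N))
    varspecs['s%d' % n] = '-alpha*s%d*(%s)' % (n, force)
    varspecs['i%d' % n] = 'alpha*s%d*(%s)' % (n, force)
    ics['s%d' % n] = 0.99
    ics['i%d' % n] = 0.01
    for m in range(N):
        pars['eps_%d_%d' % (n, m)] = eps[n][m]

DSargs = pdt.args(name='SIR_network', ics=ics, pars=pars,
                  tdata=[0, 20], varspecs=varspecs)
DS = pdt.Generator.Vode_ODEsystem(DSargs)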
I'm interested in programming languages that can reason about their own time complexity. To this end, it would be quite useful to have some way of representing time complexity programmatically, which would allow me to do things like:
f_time = O(n)
g_time = O(n^2)
h_time = O(sqrt(n))
fastest_asymptotically = min(f_time, g_time, h_time) # = h_time
total_time = f_time.inside(g_time).followed_by(h_time) # = O(n^3)
I'm using Python at the moment, but I'm not particularly tied to a language. I've experimented with sympy, but I haven't been able to find what I need out of the box there.
Is there a library that provides this capability? If not, is there a simple way to do the above with a symbolic math library?
EDIT: I wrote a simple library following @Patrick87's advice and it seems to work. I'm still interested in other solutions to this problem, though.
SymPy currently only supports the expansion at 0 (you can simulate other finite points by performing a shift). It doesn't support the expansion at infinity, which is what is used in algorithmic analysis.
But it would be a good base package for it, and if you implement it, we would gladly accept a patch (nb: I am a SymPy core developer).
Be aware that in general the problem is tricky, especially if you have two variables, or even symbolic constants. It's also tricky if you want to support oscillatory functions. EDIT: If you are interested in oscillating functions, this SymPy mailing list discussion gives some interesting papers.
EDIT 2: And I would recommend against trying to build this yourself from scratch without the use of a computer algebra system. You will end up having to write your own computer algebra system, which is a lot of work, and even more work if you want to do it right and not have it be slow. There are already tons of existing systems, including many that can act as libraries for code to be built on top of them (such as SymPy).
Actually you are building/finding an expression simplifier which can deal with:
+ (in your terms: followed_by)
* (in your terms: inside)
^, log, ! (to represent the complexity)
variables (like n, m)
constant numbers (like the 2 in 2^n)
For example, given f_time.inside(g_time).followed_by(h_time), it could be an expression like:
n*(n^2) + n^(1/2)
and you are expecting a processor to output: n^3.
So generally speaking, you might want to use a common expression simplifier (if you want it to be interesting, go check how Mathematica does it) to get a simplified expression like n^3 + n^(1/2), and then you need an additional processor to choose the term with the highest complexity from the expression and get rid of the other terms. That would be easy: just use a table to define the complexity order of each kind of symbol.
Please note that in this case the expressions are just symbols; you should write them as something like strings (for your example: f_time = "O(n)"), not as functions.
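If you do use a symbolic library instead of plain strings, a rough SymPy sketch of that "additional processor" could compare terms pairwise with limits instead of a symbol table (dominant_term is my own helper, not a library function):

import sympy as sp

n = sp.symbols('n', positive=True)

def dominant_term(expr):
    # keep only the additive term of expr that grows fastest as n -> oo
    terms = sp.Add.make_args(sp.simplify(expr))
    best = terms[0]
    for t in terms[1:]:
        if sp.limit(t / best, n, sp.oo) == sp.oo:
            best = t
    return best

print(dominant_term(n * n**2 + sp.sqrt(n)))   # n**3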
If you're only working with big-O notation and are interested in whether one function grows more or less quickly than another, asymptotically speaking...
Given functions f and g
Compute the limit as n goes to infinity of f(n)/g(n) using a computer algebra package
If the limit diverges to +infinity, then f > g - in the sense that g = O(f), but f != O(g).
If the limit converges to 0, then f < g - in the sense that f = O(g), but g != O(f).
If the limit converges to a finite number, then f = g (in the sense that f = O(g) and g = O(f))
If the limit is undefined... beats me!
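For instance, this test is easy to run in SymPy (sympy.limit is a real function; the compare helper and its classification strings are just my sketch):

import sympy as sp

n = sp.symbols('n', positive=True)

def compare(f, g):
    # classify f against g as n -> oo via the limit of f/g
    L = sp.limit(f / g, n, sp.oo)
    if L == sp.oo:
        return 'f grows faster: g = O(f) but f != O(g)'
    if L == 0:
        return 'g grows faster: f = O(g) but g != O(f)'
    if L.is_finite:
        return 'same order: f = O(g) and g = O(f)'
    return 'limit undefined (e.g. oscillation): the test is inconclusive'

print(compare(n, n**2))        # g grows faster
print(compare(sp.sqrt(n), n))  # g grows faster
print(compare(3*n, n))         # same order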
It is not my intention to plug my blog, but I don't have any other way to clarify what I really mean. The article is quite long, and it's in three parts (1, 2, 3), but if you are curious, it's worth reading.
A long time ago (5 years, at least) I programmed a Python program which generated "mathematical bacteria". These bacteria are Python objects with a simple opcode-based genetic code. You can feed them a number and they return a number, according to the execution of their code. I generate their genetic codes at random and apply environmental selection to those objects producing results similar to a predefined expected value. Then I let them duplicate, introduce mutations, and evolve them. The result is quite interesting, as their genetic code basically learns how to solve simple equations, even for values outside the training dataset.
Now, this thing is just a toy. I had time to waste and I wanted to satisfy my curiosity.
However, I assume that something along these lines has been done in research; I hope I am just reinventing the wheel here. Are you aware of more serious attempts at creating in-silico bacteria like the ones I programmed?
Please note that this is not really "genetic algorithms". A genetic algorithm uses evolution/selection to improve a vector of parameters against a given scoring function. This is somewhat different: I optimize the code, not the parameters, against a given scoring function.
If you are optimising the code, perhaps you are engaged in genetic programming?
The free utility Eureqa is similar in the sense that it can create fitting symbolic functions (much more complicated than simple linear regression, etc.) based on multivariate input data. But it uses GA to come up with the functions, so I'm not sure if that's exactly what you had in mind.
See also the "Download Your Own Robot Scientist" article on Wired for a breakdown of the general idea of how it works.
Nice article. I would say you're talking about "gene expression programming" rather than "genetic programming", by the way.
Are you familiar with Core Wars? I remember there were a number of code evolvers written for the game which had some success. For example, MicroGP++ is an assembly code generator that can be applied to the Core Wars assembly language (as well as to real problems!).