Convert my pyverilog AST to input for Z3 solver - python

I have converted my Verilog file to an AST (abstract syntax tree). The AST, together with external constraints such as the desired output of the circuit, is supposed to be given to a Z3/SMT solver, which should then give us the inputs for the circuit. However, I have no idea how to feed the AST to the Z3/SMT solver as input.
Thanks in advance.

Such a task typically amounts to walking over your AST, symbolically executing it, and generating the corresponding constraints for the SMT solver. This is easier said than done, unfortunately: there are many facets to this translation, and even when it is done fully, it is far from easy for a solver to verify the corresponding properties. For full Verilog, you would essentially have to implement a Verilog simulator that can deal with symbolic values. While that can be a very large task, perhaps you can get away with a much smaller set of features if your inputs are "simple" enough. Without knowing anything about how your Verilog is structured, it's really hard to say more.
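To make the idea concrete, here is a minimal sketch, not a full symbolic simulator: it assumes a purely combinational design made only of continuous assigns over a few operators, models every signal as an 8-bit Z3 bitvector, and handles only plain decimal constants. The node class names follow pyverilog.vparser.ast but may differ slightly between pyverilog versions; 'circuit.v' is a hypothetical file name.

import pyverilog.vparser.ast as vast
from pyverilog.vparser.parser import parse
from z3 import BitVec, Solver

WIDTH = 8
signals = {}  # Verilog identifier name -> Z3 bitvector

def sym(name):
    # one Z3 bitvector per Verilog identifier
    if name not in signals:
        signals[name] = BitVec(name, WIDTH)
    return signals[name]

def to_z3(node):
    # recursively translate an expression subtree into a Z3 term
    if isinstance(node, vast.Identifier):
        return sym(node.name)
    if isinstance(node, vast.IntConst):
        return int(node.value.split("'d")[-1])  # plain decimal constants only
    binops = {vast.And: lambda a, b: a & b, vast.Or: lambda a, b: a | b,
              vast.Xor: lambda a, b: a ^ b, vast.Plus: lambda a, b: a + b}
    for cls, fn in binops.items():
        if isinstance(node, cls):
            return fn(to_z3(node.left), to_z3(node.right))
    raise NotImplementedError(type(node).__name__)

def collect_assigns(node, acc):
    # gather all continuous assigns anywhere in the tree
    if isinstance(node, vast.Assign):
        acc.append(node)
    for child in node.children():
        collect_assigns(child, acc)

ast_root, _ = parse(['circuit.v'])
assigns = []
collect_assigns(ast_root, assigns)

solver = Solver()
for a in assigns:
    # each "assign lhs = rhs;" becomes the constraint lhs == rhs
    solver.add(sym(a.left.var.name) == to_z3(a.right.var))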
This paper, penned by the two main authors of Z3 (Nikolaj and Leonardo), provides a good survey of the approach. It's an excellent read with many useful references; starting with it will at least give you an idea of what's involved.
I should add that verification of Verilog designs is a topic with industrial applications, and there are vendor-supported tools (not cheap!) that do verification at the Verilog level. The Jasper Gold tool from Cadence is one such example; Synopsys also has a similar tool.
It seems you are interested in test-case generation. In such a setting, that corresponds to writing a typical "cover" property and reading off the values of the primary inputs that lead to the cover scenario. Such properties are typically written in the SVA format, which these tools understand.
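In the Z3 sketch above, the same idea is just an extra constraint on the output plus a model query; 'out' is a hypothetical output name here.

from z3 import BitVecVal, sat

solver.add(sym('out') == BitVecVal(1, WIDTH))  # "cover": drive the output to 1
if solver.check() == sat:
    model = solver.model()
    for name, bv in signals.items():
        # read back concrete values that realize the scenario
        print(name, model.eval(bv, model_completion=True))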

Any comparison between different SMT solvers?

I have an implementation in Python that makes use of theorem proving. I would like to know whether it is possible to speed up the SMT-solving part, which currently uses Z3.
I am trying out different solvers and have found cvc4/cvc5 and Yices as multi-theory (arithmetic, equality, bitvectors, ...) solvers. I also found dReal and MetiTarski (this one seems to be out of date) for the specific case of real arithmetic.
My intention is to test my implementation against those tools' APIs, to see whether I can use one solver or another depending on the sort of problem I want to solve.
However, I would like to know in advance whether there is some kind of comparison between these solvers, to have a more useful foundation for my findings. I am interested in both standard benchmarks and user tests published on GitHub or Stack Overflow.
I only found this cvc5 paper (https://www-cs.stanford.edu/~preiner/publications/2022/BarbosaBBKLMMMN-TACAS22.pdf), which, unsurprisingly, suggests it as the best option. I also found this minimal comparison (https://lemire.me/blog/2020/11/08/benchmarking-theorem-provers-for-programming-tasks-yices-vs-z3/), which reports that Yices is 15 times faster than Z3 for one concrete example.
Any advice?
Yices: https://yices.csl.sri.com/
cvc5: https://cvc5.github.io/
dReal: http://dreal.github.io/
MetiTarski: https://www.cl.cam.ac.uk/~lp15/papers/Arith/index.html
You can always look at the results of the SMT competition: https://smt-comp.github.io
Having said that, I think it's a fool's errand to look for the "best." There isn't a good yardstick to compare all solvers in a meaningful way: it all depends on your particular application.
If your system allows for multiple backend solvers, why not take advantage of the many cores on modern machines: spawn all of them and take the result of the first one to complete. Any a priori selection of a solver will run into cases where another solver would have performed better, so running all available solvers and taking the fastest answer is the best way to utilize your hardware.
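A minimal portfolio sketch along those lines, assuming the z3, cvc5 and yices-smt2 binaries are on your PATH and that "problem.smt2" is a hypothetical SMT-LIB v2 file:

import subprocess
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

SOLVERS = {
    'z3': ['z3', '-smt2', 'problem.smt2'],
    'cvc5': ['cvc5', 'problem.smt2'],
    'yices': ['yices-smt2', 'problem.smt2'],
}

def run(item):
    # run one solver as a subprocess and return its name and answer
    name, cmd = item
    out = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
    return name, out.stdout.strip()

with ThreadPoolExecutor(max_workers=len(SOLVERS)) as pool:
    futures = [pool.submit(run, item) for item in SOLVERS.items()]
    done, pending = wait(futures, return_when=FIRST_COMPLETED)
    name, answer = next(iter(done)).result()
    print('first answer from %s: %s' % (name, answer))
    for f in pending:
        f.cancel()  # note: does not kill solver processes that already started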

How do you robustly generate code, given a description of its exact behavior in another language?

I have been reverse-engineering a specific black-box equation that is part of a system I do not own (don't worry, it's white-hat), in which I can only measure the inputs (a large set of integers) and the outputs (two integers).
This system can only be perfectly described as a program/function in which all the input integers are used. So far I can perfectly describe the behavior by creating a data structure of named "mathematical terms": each named input integer lives in one of these terms, and each term has an ordering over the inputs it owns. I also have a function that takes the model description and a set of named inputs and produces the two output integers, so the mapping from lists of input names to program behavior lives partly in that function and partly in the model description.
I have been writing the reverse-engineering utility in Python, but ultimately I want to output a low-level Lua program that represents this function in a less abstract manner. When the model had fewer terms, it was simple to manually write a "transpiler" from this model (in Python) to Lua, but as the complexity grows it becomes painful to rewrite the code generator for new types of terms, especially in an ad-hoc manner.
From reading other questions about similar systems, it seems the very last two steps of this process would be: generating an abstract syntax tree representing my desired program, and giving that AST to a Lua pretty-printer to generate the code. But I'm not sure whether there are useful abstractions I'm unaware of that would help me generate a Lua AST from my current description of the model.
What you're looking for is an abstract syntax tree, which defines the behavior of a program as a tree of nodes. Since each node of an abstract syntax tree is highly compartmentalized (e.g., "Add", "Number Constant", ...), it is straightforward to translate an abstract syntax tree back into a high-level programming language such as Lua.
Abstract syntax trees are used in many compilers and transpilers, so you will not have to dig long to find good examples; a minimal sketch follows the projects listed below.
CSharp.Lua does something similar to what you want: transpiling C# to Lua using a simple abstract syntax tree and a slightly less simple code generator.
Speedy Web Compiler contains an excellent implementation of a JavaScript code generator.
ESBuild also has a well-done implementation for JavaScript.
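Here is the promised sketch: a tiny hand-rolled expression AST (Num, Var and BinOp are made-up node classes, not from any library) and a recursive emitter that prints Lua source.

from dataclasses import dataclass

@dataclass
class Num:
    value: int

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str        # '+', '-', '*', ...
    left: object
    right: object

def emit_lua(node):
    # pretty-print an expression tree as Lua source text
    if isinstance(node, Num):
        return str(node.value)
    if isinstance(node, Var):
        return node.name
    if isinstance(node, BinOp):
        return '(%s %s %s)' % (emit_lua(node.left), node.op, emit_lua(node.right))
    raise TypeError(type(node).__name__)

# e.g. a model term a*x + b becomes a Lua function body
tree = BinOp('+', BinOp('*', Var('a'), Var('x')), Var('b'))
print('local function f(a, x, b) return %s end' % emit_lua(tree))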

solving ODEs on networks with PyDSTool

After using scipy.integrate for a while, I am at the point where I need more functionality, like bifurcation analysis or parameter estimation. That is why I'm interested in using PyDSTool, but from the documentation I can't figure out how to work with ModelSpec, or whether it is actually what will lead me to the solution.
Here is a toy example of what I am trying to do: I have a network with two nodes, both having the same (SIR) dynamics, described by two ODEs each, but with different initial conditions. The equations are coupled between nodes via the epsilon parameters (see the formulas below).
The formulas are linked as a picture for better readability; the 'n' and 'm' are indices, not exponents:
http://image.noelshack.com/fichiers/2014/28/1404918182-odes.png
(I could not use the image upload on Stack, sadly.)
In the two node case my code (using PyDSTool) looks like this:
#multiple SIR metapopulations
#parameter and initial condition definition; a dict is a must
import PyDSTool as pdt

params = {'alpha': 0.7, 'beta': 0.1, 'epsilon1': 0.5, 'epsilon2': 0.5}
ini = {'s1': 0.99, 's2': 1, 'i1': 0.01, 'i2': 0.00}

DSargs = pdt.args(name='SIRtest_multi',
                  ics=ini,
                  pars=params,
                  tdata=[0, 20],
                  # the for-macro generates formulas for s1,s2 and i1,i2;
                  # sum works similarly but sums over the expressions in it
                  varspecs={'s[o]': 'for(o,1,2,-alpha*s[o]*sum(k,1,2,epsilon[k]*i[k]))',
                            'i[l]': 'for(l,1,2,alpha*s[l]*sum(m,1,2,epsilon[m]*i[m]))'})

# generator
DS = pdt.Generator.Vode_ODEsystem(DSargs)
# computation; a trajectory object is generated
trj = DS.compute('test')
# extraction of the points for plotting
pts = trj.sample()

# plotting; pylab is imported along with PyDSTool as plt
pdt.plt.plot(pts['t'], pts['s1'], label='s1')
pdt.plt.plot(pts['t'], pts['i1'], label='i1')
pdt.plt.plot(pts['t'], pts['s2'], label='s2')
pdt.plt.plot(pts['t'], pts['i2'], label='i2')
pdt.plt.legend()
pdt.plt.xlabel('t')
pdt.plt.show()
But in my original problem there are more than 1000 nodes with 5 ODEs each, every node is coupled to a different number of other nodes, and the epsilon values are not the same for all nodes. So tinkering with this syntax has not led me anywhere near a solution yet.
What I am actually thinking of is a way to construct separate sub-models/solvers(?) for every node, each having its own parameters (the epsilons, since they differ for every node), and then link them to each other. And this is the point where I do not know whether that is possible in PyDSTool, and whether it is the right way to handle this kind of problem.
I looked through the examples and the docs of PyDSTool but could not figure out how to do it, so help is very much appreciated! If the way I'm trying to do things is unorthodox or plain stupid, you are welcome to suggest how to do it more efficiently. (Which is actually the more efficient/faster/better way to solve problems like this: subdividing it into many small (still not decoupled) models/solvers, or one model containing all the ODEs at once?)
(I'm neither a mathematician nor a programmer, but willing to learn, so please be patient!)
The solution is definitely not to build separate simulation models. That won't work because so many variables will be continuously coupled between the sub-models. You absolutely must have all the ODEs in one place together.
It sounds like the solution you need is to use the ModelSpec object constructs. These let you hierarchically build the sub-model definitions out of symbolic pieces. The pieces can have their own "epsilon" parameters, etc. Once you have declared all the pieces, you let PyDSTool create the final strings containing the ODE definitions for you. I suggest you look at the tutorial example at:
http://www.ni.gsu.edu/~rclewley/PyDSTool/Tutorial/Tutorial_compneuro.html
and the provided examples: ModelSpec_test.py, MultiCompartments.py. But, remember that you still have to have a source for the parameters and coupling data (i.e., a big matrix or dictionary loaded from a file) to be able to automate the process of building the model, otherwise you'd still be writing it all out by hand.
You have to build some classes for the components that you want to have. You might also create a factory function (compare 'makeSoma' in the neuralcomp.py toolbox) that will take all your sub-components and create an ODE based on summing something up from each of the declared components. At the end, you can refer to the parameters by their position in the hierarchy. One might be 's1.epsilon' while another might be 'i4.epsilon'.
Unfortunately, to build models like this efficiently you will have to learn to do some more complex programming! So start by understanding all the steps in the tutorial. Once you've gotten started and have specific questions, you can reach me through the SourceForge support discussions or by email.
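If the ModelSpec route feels like too much at first, a lower-tech alternative (using only the pdt.args/Vode_ODEsystem API from your own example, not ModelSpec) is to generate the varspecs and parameter strings programmatically from your coupling data. Here 'neighbours' and 'epsilons' are made-up stand-ins for whatever matrix or dictionary you load from a file:

import PyDSTool as pdt

# hypothetical coupling structure: node -> list of nodes it is coupled to
neighbours = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
epsilons = {n: 0.5 for n in neighbours}          # one epsilon per node (made up)

params = {'alpha': 0.7, 'beta': 0.1}
ini, varspecs = {}, {}
for n in neighbours:
    params['epsilon%d' % n] = epsilons[n]
    ini['s%d' % n] = 0.99 if n == 1 else 1.0
    ini['i%d' % n] = 0.01 if n == 1 else 0.0
    # build the coupling sum for this node as a plain string
    coupling = ' + '.join('epsilon%d*i%d' % (m, m) for m in neighbours[n])
    varspecs['s%d' % n] = '-alpha*s%d*(%s)' % (n, coupling)
    varspecs['i%d' % n] = 'alpha*s%d*(%s)' % (n, coupling)

DSargs = pdt.args(name='SIR_network', ics=ini, pars=params,
                  tdata=[0, 20], varspecs=varspecs)
DS = pdt.Generator.Vode_ODEsystem(DSargs)
trj = DS.compute('network_test')

Note that all the ODEs still end up in a single generator, as recommended above; only the bookkeeping is automated.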

Partial evaluation with pyparsing

I need to be able to take a formula that uses the OpenDocument formula syntax, parse it into syntax that Python can understand, but without evaluating the variables, and then be able to evaluate the formula many times with changing values for the variables.
Formulas can be user input, so pyparsing allows me both to handle the formula syntax effectively and to sanitize the user input. There are a number of good pyparsing examples available, but all the mathematical ones seem to assume that everything is evaluated in the current scope immediately.
For context, I am working with a model of the industrial economy (life cycle assessment, or LCA), where these formulas represent the amount of material or energy exchanged between processes. The amount of a variable can be a function of several parameters, such as geographical location. The chain of formula and variable references is stored in a directed acyclic graph, so that formulas can always be simply evaluated. Formulas are stored as strings in a database.
My questions are:
Is it possible to parse a formula such that the parsed evaluation can also be stored in the database (as a string to be evaled, or something else)?
Are there alternatives to this approach? Bear in mind that the ideal solution is to parse/write once, and read many times. For example, partially parsing the formula, and then using the ast module, although I don't know how this could work with database storage.
Any examples of a project or library similar to this that I could look over? I am not a programmer, just a student trying to finish his thesis while making an open-source LCA software model in my spare time.
Is this approach too slow? I would like to be able to do substantial Monte Carlo runs, where each run could involve tens of thousands of formula evaluations (it is a big database).
1) Yes, it is possible to pickle the results from parsing your expression, and save that to a database. Then you can just fetch and unpickle the expression, rather than reparse the original again.
2) You can do a quick-and-dirty pass at this just using the compile and eval built-ins, as in the following interactive session:
>>> y = compile("m*x+b","","eval")
>>> m = 100
>>> x = 5
>>> b = 1
>>> eval(y)
501
Of course, this has the security pitfalls of any eval- or exec-based implementation, in that untrusted or malicious source strings can embed harmful system calls. But if this is your thesis and entirely within your scope of control, just don't do anything foolish.
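One common mitigation (not a complete sandbox, just a sketch): strip the builtins from the evaluation environment and expose only the variables you intend to.

code = compile("m*x + b", "<formula>", "eval")
variables = {"m": 100, "x": 5, "b": 1}
result = eval(code, {"__builtins__": {}}, variables)   # -> 501, with no access to builtins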
3) You can find an online example of parsing an expression into an "evaluatable" data structure on the pyparsing wiki's Examples page. Check out simpleBool.py and evalArith.py especially. If you're feeling flush, order a back issue of the May 2008 issue of Python Magazine, which has my article "Writing a Simple Interpreter/Compiler with Pyparsing", with a more detailed description of the methods used, plus a description of how pickling and unpickling the parsed results works.
4) The slow part will be the parsing, so you are on the right track in preserving the results in some intermediate and repeatably-evaluatable form. The eval part should be fairly snappy. The second slow part will be fetching these pickled structures from your database. During your MC run, I would write a single function that takes the selection parameters for an expression, fetches it from the database, unpickles it, and returns the evaluatable expression. Then, once you have this working, use a memoize decorator to cache these query-result pairs, so that any given expression only needs to be fetched and unpickled once.
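A sketch of that last step, using functools.lru_cache as the memoize decorator; PICKLED_DB is a stand-in dictionary for your real database lookup:

import functools
import pickle

PICKLED_DB = {}   # expression id -> pickled parse result (stand-in for the database)

@functools.lru_cache(maxsize=None)
def get_expression(expr_id):
    # fetch and unpickle once per id; later calls reuse the cached object
    return pickle.loads(PICKLED_DB[expr_id])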
Good luck with your thesis!

Code bacteria: evolving mathematical behavior

It is not my intention to plug my blog, but I don't have any other way to clarify what I really mean. The article is quite long, and it's in three parts (1, 2, 3), but if you are curious, it's worth reading.
A long time ago (5 years, at least) I wrote a Python program that generated "mathematical bacteria". These bacteria are Python objects with a simple opcode-based genetic code. You can feed them a number and they return a number, according to the execution of their code. I generate their genetic codes at random and apply an environmental selection to the objects that produce a result close to a predefined expected value. Then I let them duplicate, introduce mutations, and evolve. The result is quite interesting, as their genetic code basically learns how to solve simple equations, even for values outside the training dataset.
Now, this thing is just a toy. I had time to waste and I wanted to satisfy my curiosity.
However, I assume that something along these lines has already been done in terms of research; I hope I am just reinventing the wheel here. Are you aware of more serious attempts at creating in-silico bacteria like the ones I programmed?
Please note that this is not really "genetic algorithms". A genetic algorithm is when you use evolution/selection to improve a vector of parameters against a given scoring function. This is somewhat different: I optimize the code, not the parameters, against a given scoring function.
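For concreteness, here is a minimal sketch of that kind of opcode-based evolution (the opcode set, genome length and population size here are all made-up choices, not the original program):

import random

OPS = ['push_x', 'push_1', 'add', 'mul']

def run(genome, x):
    # interpret a genome on a tiny stack machine; the input x starts on the stack
    stack = [x]
    for op in genome:
        if op == 'push_x':
            stack.append(x)
        elif op == 'push_1':
            stack.append(1)
        elif op == 'add' and len(stack) >= 2:
            stack.append(stack.pop() + stack.pop())
        elif op == 'mul' and len(stack) >= 2:
            stack.append(stack.pop() * stack.pop())
    return stack[-1]

def fitness(genome, target, xs):
    # squared error against the expected values: lower is better
    return sum((run(genome, x) - target(x)) ** 2 for x in xs)

def mutate(genome):
    g = list(genome)
    g[random.randrange(len(g))] = random.choice(OPS)
    return g

target = lambda x: x * x + 1          # the "equation" the bacteria should learn
xs = range(-5, 6)                     # training inputs
population = [[random.choice(OPS) for _ in range(8)] for _ in range(200)]
for generation in range(200):
    population.sort(key=lambda g: fitness(g, target, xs))
    survivors = population[:50]       # environmental selection
    population = survivors + [mutate(random.choice(survivors)) for _ in range(150)]
print(population[0], fitness(population[0], target, xs))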
If you are optimising the code, perhaps you are engaged in genetic programming?
The free utility Eureqa is similar in the sense that it can create fitted symbolic functions (much more complicated than simple linear regression, etc.) based on multivariate input data. But it uses GA to come up with the functions, so I'm not sure whether that's exactly what you had in mind.
See also the "Download Your Own Robot Scientist" article on Wired for a breakdown of the general idea of how it works.
Nice article. I would say you're talking about "gene expression programming" rather than "genetic programming", by the way.
Are you familiar with Core Wars? I remember there were a number of code evolvers written for the game which had some success. For example, MicroGP++ is an assembly code generator that can be applied to the Core Wars assembly language (as well as to real problems!).
