There is function in python called eval that takes string input and evaluates it.
>>> x = 1
>>> print eval('x+1')
2
>>> print eval('12 + 32')
44
>>>
What is Haskell equivalent of eval function?
It is true that in Haskell, as in Java or C++ or similar languages, you can call out to the compiler, then dynamically load the code and execute it. However, this is generally heavy weight and almost never why people use eval() in other languages.
People tend to use eval() in a language because given that language's facilities, for certain classes of problem, it is easier to construct a string from the program input that resembles the language itself, rather than parse and evaluate the input directly.
For example, if you want to allow users to enter not just numbers in an input field, but simple arithmetic expressions, in Perl or Python it is going to be much easier to just call eval() on the input than writing a parser for the expression language you want to allow. Unfortunately, this kind of approach, almost always results in a poor user experience overall (compiler error messages weren't meant for non-programmers) and opens security holes. Solving these problems without using eval() generally involves a fair bit of code.
In Haskell, thanks to things like Parsec, it is actually very easy to write a parser and evaluator for these kinds of input problems, and considerably removes the yearning for eval.
It doesn't have an inbuilt eval function. However are some packages on hackage that can do the same sort of thing. (docs). Thanks to #luqui there is also hint.
There is no 'eval' built into the language, though Template Haskell allows compile time evaluation.
For runtime 'eval' -- i.e. runtime metaprogramming -- there are a number of packages on Hackage that essentially import GHC or GHCi, including the old hs-plugins package, and the hint package.
hs-plugins has System.Eval.Haskell.
This question is asked 11 years ago, and now use the package hint we can define eval very easily, following is an example as self-contained script (you still need nix to run it)
#!/usr/bin/env nix-shell
#! nix-shell -p "haskellPackages.ghcWithPackages (p: with p; [hint])"
#! nix-shell -i "ghci -ignore-dot-ghci -fdefer-type-errors -XTypeApplications"
{-# LANGUAGE ScopedTypeVariables, TypeApplications, PartialTypeSignatures #-}
import Data.Typeable (Typeable)
import qualified Language.Haskell.Interpreter as Hint
-- DOC: https://www.stackage.org/lts-18.18/package/hint-0.9.0.4
eval :: forall t. Typeable t => String -> IO t
eval s = do
mr <- Hint.runInterpreter $ do
Hint.setImports ["Prelude"]
Hint.interpret s (Hint.as :: t)
case mr of
Left err -> error (show err)
Right r -> pure r
-- * Interpret expressions into values:
e1 = eval #Int "1 + 1 :: Int"
e2 = eval #String "\"hello eval\""
-- * Send values from your compiled program to your interpreted program by interpreting a function:
e3 = do
f <- eval #(Int -> [Int]) "\\x -> [1..x]"
pure (f 5)
There is no eval equivalent, Haskell is a statically compiled language, same as C or C++ which does not have eval neither.
This answer shows a minimal example of using the hint package, but it lacks couple of things:
How to evaluate using bindings like let x = 1 in x + 1.
How to handle exceptions, specifically divide by zero.
Here is a more complete example:
import qualified Control.DeepSeq as DS
import Control.Exception (ArithException (..))
import qualified Control.Exception as Ex
import qualified Control.Monad as M
import qualified Data.Either as E
import qualified Language.Haskell.Interpreter as I
evalExpr :: String -> [(String, Integer)] -> IO (Maybe Integer)
evalExpr expr a = Ex.handle handler $ do
i <- I.runInterpreter $ do
I.setImports ["Prelude"]
-- let var = value works too
let stmts = map (\(var, val) -> var ++ " <- return " ++ show val) a
M.forM_ stmts $ \s -> do
I.runStmt s
I.interpret expr (I.as :: Integer)
-- without force, exception is not caught
(Ex.evaluate . DS.force) (E.either (const Nothing) Just i)
where
handler :: ArithException -> IO (Maybe Integer)
handler DivideByZero = return Nothing
handler ex = error $ show ex
Related
The print used to be a statement in Python 2, but now it became a function that requires parenthesis in Python 3.
Is there anyway to suppress these parenthesis in Python 3? Maybe by re-defining the print function?
So, instead of
print ("Hello stack over flowers")
I could type:
print "Hello stack over flowers"
Although you need a pair of parentheses to print in Python 3, you no longer need a space after print, because it's a function. So that's only a single extra character.
If you still find typing a single pair of parentheses to be "unnecessarily time-consuming," you can do p = print and save a few characters that way. Because you can bind new references to functions but not to keywords, you can only do this print shortcut in Python 3.
Python 2:
>>> p = print
File "<stdin>", line 1
p = print
^
SyntaxError: invalid syntax
Python 3:
>>> p = print
>>> p('hello')
hello
It'll make your code less readable, but you'll save those few characters every time you print something.
Using print without parentheses in Python 3 code is not a good idea. Nor is creating aliases, etc. If that's a deal breaker, use Python 2.
However, print without parentheses might be useful in the interactive shell. It's not really a matter of reducing the number of characters, but rather avoiding the need to press Shift twice every time you want to print something while you're debugging. IPython lets you call functions without using parentheses if you start the line with a slash:
Python 3.6.6 (default, Jun 28 2018, 05:43:53)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: var = 'Hello world'
In [2]: /print var
Hello world
And if you turn on autocall, you won't even need to type the slash:
In [3]: %autocall
Automatic calling is: Smart
In [4]: print var
------> print(var)
Hello world
Use Autohotkey to make a macro. AHK is free and dead simple to install. www.autohotkey.com
You could assign the macro to, say, alt-p:
!p::send print(){Left}
That will make alt-p put out print() and move your cursor to inside the parens.
Or, even better, to directly solve your problem, you define an autoreplace and limit its scope to when the open file has the .py extension:
#IfWinActive .py ;;; scope limiter
:b*:print ::print(){Left} ;;; I forget what b* does. The rest should be clear
#IfWinActive ;;; remove the scope limitation
This is a guaranteed, painless, transparent solution.
No. That will always be a syntax error in Python 3. Consider using 2to3 to translate your code to Python 3
The AHK script is a great idea. Just for those interested I needed to change it a little bit to work for me:
SetTitleMatchMode,2 ;;; allows for a partial search
#IfWinActive, .py ;;; scope limiter to only python files
:b*:print ::print(){Left} ;;; I forget what b* does
#IfWinActive ;;; remove the scope limitation
I finally figured out the regex to change these all in old Python2 example scripts. Otherwise use 2to3.py.
Try it out on Regexr.com, doesn't work in NP++(?):
find: (?<=print)( ')(.*)(')
replace: ('$2')
for variables:
(?<=print)( )(.*)(\n)
('$2')\n
for label and variable:
(?<=print)( ')(.*)(',)(.*)(\n)
('$2',$4)\n
You can use operator oveloading:
class Print:
def __lt__(self, thing):
print(thing)
p = Print()
p < 'hello' # -> hello
Or if you like use << like in c++ iostreams.
You can't, because the only way you could do it without parentheses is having it be a keyword, like in Python 2. You can't manually define a keyword, so no.
In Python 3, print is a function, whereas it used to be a statement in previous versions. As #holdenweb suggested, use 2to3 to translate your code.
I have configured vim to auto-add the parens around my debug def calls when I write the file. I use a simple watcher that auto-runs my updates when the timestamp changes. And I defined CTRL+p to delete them all. The only caveat is that I have to have them alone on a line, which I pretty much always do. I thought about trying to save and restore my previous search-highlight. And I did find a solution on Stack. But it looked a little too deep for now. Now I can finally get back to debugging Ruby-style with a quick p<space><obj>.
function! PyAddParensToDebugs()
execute "normal! mZ"
execute "%s/^\\(\\s*\\)\\(p\\|pe\\|pm\\|pp\\|e\\|ppe\\)\\( \\(.*\\)\\)\\{-}$/\\1\\2(\\4)/"
execute "normal! `Z"
endfunction
autocmd BufWrite *.py silent! call PyAddParensToDebugs()
map <c-p> :g/^\\s*\\(p\\\|pe\\\|pm\\\|pp\\\|e\\\|ppe\\)\\( \\\|\(\\)/d<cr>
I use these Python defs to do my debug printing.
def e():
exit()
def pp(obj):
pprint.PrettyPrinter(indent=4).pprint(obj)
def ppe(obj):
pprint.PrettyPrinter(indent=4).pprint(obj)
exit()
def p(obj):
print(repr(obj))
def pe(obj):
print(repr(obj))
exit()
def pm():
print("xxx")
Programming in C I used to have code sections only used for debugging purposes (logging commands and the like). Those statements could be completely disabled for production by using #ifdef pre-processor directives, like this:
#ifdef MACRO
controlled text
#endif /* MACRO */
What is the best way to do something similar in python?
If you just want to disable logging methods, use the logging module. If the log level is set to exclude, say, debug statements, then logging.debug will be very close to a no-op (it just checks the log level and returns without interpolating the log string).
If you want to actually remove chunks of code at bytecode compile time conditional on a particular variable, your only option is the rather enigmatic __debug__ global variable. This variable is set to True unless the -O flag is passed to Python (or PYTHONOPTIMIZE is set to something nonempty in the environment).
If __debug__ is used in an if statement, the if statement is actually compiled into only the True branch. This particular optimization is as close to a preprocessor macro as Python ever gets.
Note that, unlike macros, your code must still be syntactically correct in both branches of the if.
To show how __debug__ works, consider these two functions:
def f():
if __debug__: return 3
else: return 4
def g():
if True: return 3
else: return 4
Now check them out with dis:
>>> dis.dis(f)
2 0 LOAD_CONST 1 (3)
3 RETURN_VALUE
>>> dis.dis(g)
2 0 LOAD_GLOBAL 0 (True)
3 JUMP_IF_FALSE 5 (to 11)
6 POP_TOP
7 LOAD_CONST 1 (3)
10 RETURN_VALUE
>> 11 POP_TOP
3 12 LOAD_CONST 2 (4)
15 RETURN_VALUE
16 LOAD_CONST 0 (None)
19 RETURN_VALUE
As you can see, only f is "optimized".
It is important to understand that in Python def and class are two regular executable statements...
import os
if os.name == "posix":
def foo(x):
return x * x
else:
def foo(x):
return x + 42
...
so to do what you do with preprocessor in C and C++ you can use the the regular Python language.
Python language is fundamentally different from C and C++ on this point because there exist no concept of "compile time" and the only two phases are "parse time" (when the source code is read in) and "run time" when the parsed code (normally mostly composed of definition statements but that is indeed arbitrary Python code) is executed.
I am using the term "parse time" even if technically when the source code is read in the transformation is a full compilation to bytecode because the semantic of C and C++ compilation is different and for example the definition of a function happens during that phase (while instead it happens at runtime in Python).
Even the equivalent of #include of C and C++ (that in Python is import) is a regular statement that is executed at run time and not at compile (parse) time so it can be placed inside a regular python if. Quite common is for example having an import inside a try block that will provide alternate definitions for some functions if a specific optional Python library is not present on the system.
Finally note that in Python you can even create new functions and classes at runtime from scratch by the use of exec, not necessarily having them in your source code. You can also assemble those objects directly using code because classes and functions are indeed just regular objects (this is normally done only for classes, however).
There are some tools that instead try to consider def and class definitions and import statements as "static", for example to do a static analysis of Python code to generate warnings on suspicious fragments or to create a self-contained deployable package that doesn't depend on having a specific Python installation on the system to run the program. All of them however need to be able to consider that Python is more dynamic than C or C++ in this area and they also allow adding exceptions for where the automatic analysis will fail.
Here is an example that I use to distinguish between Python 2 & 3 for my Python Tk programs:
import sys
if sys.version_info[0] == 3:
from tkinter import *
from tkinter import ttk
else:
from Tkinter import *
import ttk
""" rest of your code """
Hope that is a useful illustration.
As far as I am aware, you have to use actual if statements. There is no preprocessor, so there is no analogue to preprocessor directives.
Edit: Actually, it looks like the top answer to this question will be more illuminating: How would you do the equivalent of preprocessor directives in Python?
Supposedly there is a special variable __debug__ which, when used with an if statement, will be evaluated once and then not evaluated again during execution.
There is no direct equivalent that I'm aware of, so you might want to zoom-out and reconsider the problems that you used to solve using the preprocessor.
If it's just diagnostic logging you're after then there is a comprehensive logging module which should cover everything you wanted and more.
http://docs.python.org/library/logging.html
What else do you use the preprocessor for? Test configurations? There's a config module for that.
http://docs.python.org/library/configparser.html
Anything else?
If you are using #ifdef to check for variables that may have been defined in the scope above the current file, you can use exceptions. For example, I have scripts that I want to run differently from within ipython vs outside ipython (show plots vs save plots, for example). So I add
ipy = False
try:
ipy = __IPYTHON__
except NameError:
pass
This leaves me with a variable ipy, which tells me whether or not __IPYTHON__ was declared in a scope above my current script. This is the closest parallel I know of for an #ifdef function in Python.
For ipython, this is a great solution. You could use similar constructions in other contexts, in which a calling script sets variable values and the inner scripts check accordingly. Whether or not this makes sense, of course, would depend on your specific use case.
If you're working on Spyder, you probably only need this:
try:
print(x)
except:
#code to run under ifndef
x = "x is defined now!"
#other code
The first time, you run your script, you'll run the code under #code to run under ifndef, the second, you'll skip it.
Hope it works:)
This can be achieved by passing command line argument as below:
import sys
my_macro = 0
if(len(sys.argv) > 1):
for x in sys.argv:
if(x == "MACRO"):
my_macro = 1
if (my_macro == 1):
controlled text
Try running the following script and observe the results after this:
python myscript.py MACRO
Hope this helps.
I've encounter a following statement by Richard Stallman:
'When you start a Lisp system, it enters a read-eval-print loop. Most other languages have nothing comparable to read, nothing comparable to eval, and nothing comparable to print. What gaping deficiencies! '
Now, I did very little programming in Lisp, but I've wrote considerable amount of code in Python and recently a little in Erlang. My impression was that these languages also offer read-eval-print loop, but Stallman disagrees (at least about Python):
'I skimmed documentation of Python after people told me it was fundamentally similar to Lisp. My conclusion is that that is not so. When you start Lisp, it does 'read', 'eval', and 'print', all of which are missing in Python.'
Is there really a fundamental technical difference between Lisp's and Python's read-eval-print loops? Can you give examples of things that Lisp REPL makes easy and that are difficult to do in Python?
In support of Stallman's position, Python does not do the same thing as typical Lisp systems in the following areas:
The read function in Lisp reads an S-expression, which represents an arbitrary data structure that can either be treated as data, or evaluated as code. The closest thing in Python reads a single string, which you would have to parse yourself if you want it to mean anything.
The eval function in Lisp can execute any Lisp code. The eval function in Python evaluates only expressions, and needs the exec statement to run statements. But both these work with Python source code represented as text, and you have to jump through a bunch of hoops to "eval" a Python AST.
The print function in Lisp writes out an S-expression in exactly the same form that read accepts. print in Python prints out something defined by the data you're trying to print, which is certainly not always reversible.
Stallman's statement is a bit disingenuous, because clearly Python does have functions named exactly eval and print, but they do something different (and inferior) to what he expects.
In my opinion, Python does have some aspects similar to Lisp, and I can understand why people might have recommended that Stallman look into Python. However, as Paul Graham argues in What Made Lisp Different, any programming language that includes all the capabilities of Lisp, must also be Lisp.
Stallman's point is that not implementing an explicit "reader" makes Python's REPL appear crippled compared to Lisps because it removes a crucial step from the REPL process. Reader is the component that transforms a textual input stream into the memory — think of something like an XML parser built into the language and used for both source code and for data. This is useful not only for writing macros (which would in theory be possible in Python with the ast module), but also for debugging and introspection.
Say you're interested in how the incf special form is implemented. You can test it like this:
[4]> (macroexpand '(incf a))
(SETQ A (+ A 1)) ;
But incf can do much more than incrementing symbol values. What exactly does it do when asked to increment a hash table entry? Let's see:
[2]> (macroexpand '(incf (gethash htable key)))
(LET* ((#:G3069 HTABLE) (#:G3070 KEY) (#:G3071 (+ (GETHASH #:G3069 #:G3070) 1)))
(SYSTEM::PUTHASH #:G3069 #:G3070 #:G3071)) ;
Here we learn that incf calls a system-specific puthash function, which is an implementation detail of this Common Lisp system. Note how the "printer" is making use of features known to the "reader", such as introducing anonymous symbols with the #: syntax, and referring to the same symbols within the scope of the expanded expression. Emulating this kind of inspection in Python would be much more verbose and less accessible.
In addition to the obvious uses at the REPL, experienced Lispers use print and read in the code as a simple and readily available serialization tool, comparable to XML or json. While Python has the str function, equivalent to Lisp's print, it lacks the equivalent of read, the closest equivalent being eval. eval of course conflates two different concepts, parsing and evaluation, which leads to problems like this and solutions like this and is a recurring topic on Python forums. This would not be an issue in Lisp precisely because the reader and the evaluator are cleanly separated.
Finally, advanced features of the reader facility enable the programmer to extend the language in ways that even macros could not otherwise provide. A perfect example of such making hard things possible is the infix package by Mark Kantrowitz, implementing a full-featured infix syntax as a reader macro.
In a Lisp-based system one typically develops the program while it is running from the REPL (read eval print loop). So it integrates a bunch of tools: completion, editor, command-line-interpreter, debugger, ... The default is to have that. Type an expression with an error - you are in another REPL level with some debugging commands enabled. You actually have to do something to get rid of this behavior.
You can have two different meanings of the REPL concept:
the Read Eval Print Loop like in Lisp (or a few other similar languages). It reads programs and data, it evaluates and prints the result data. Python does not work this way. Lisp's REPL allows you to work directly in a meta-programming way, writing code which generates (code), check the expansions, transform actual code, etc.. Lisp has read/eval/print as the top loop. Python has something like readstring/evaluate/printstring as the top-loop.
the Command Line Interface. An interactive shell. See for example for IPython. Compare that to Common Lisp's SLIME.
The default shell of Python in default mode is not really that powerful for interactive use:
Python 2.7.2 (default, Jun 20 2012, 16:23:33)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a+2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>>
You get an error message and that's it.
Compare that to the CLISP REPL:
rjmba:~ joswig$ clisp
i i i i i i i ooooo o ooooooo ooooo ooooo
I I I I I I I 8 8 8 8 8 o 8 8
I \ `+' / I 8 8 8 8 8 8
\ `-+-' / 8 8 8 ooooo 8oooo
`-__|__-' 8 8 8 8 8
| 8 o 8 8 o 8 8
------+------ ooooo 8oooooo ooo8ooo ooooo 8
Welcome to GNU CLISP 2.49 (2010-07-07) <http://clisp.cons.org/>
Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
Copyright (c) Bruno Haible, Sam Steingold 1999-2000
Copyright (c) Sam Steingold, Bruno Haible 2001-2010
Type :h and hit Enter for context help.
[1]> (+ a 2)
*** - SYSTEM::READ-EVAL-PRINT: variable A has no value
The following restarts are available:
USE-VALUE :R1 Input a value to be used instead of A.
STORE-VALUE :R2 Input a new value for A.
ABORT :R3 Abort main loop
Break 1 [2]>
CLISP uses Lisp's condition system to break into a debugger REPL. It presents some restarts. Within the error context, the new REPL provides extended commands.
Let's use the :R1 restart:
Break 1 [2]> :r1
Use instead of A> 2
4
[3]>
Thus you get interactive repair of programs and execution runs...
Python's interactive mode differs from Python's "read code from file" mode in several, small, crucial ways, probably inherent in the textual representation of the language. Python is also not homoiconic, something that makes me call it "interactive mode" rather than "read-eval-print loop". That aside, I'd say that it is more a difference of grade than a difference in kind.
Now, something tahtactually comes close to "difference in kind", in a Python code file, you can easily insert blank lines:
def foo(n):
m = n + 1
return m
If you try to paste the identical code into the interpreter, it will consider the function to be "closed" and complain that you have a naked return statement at the wrong indentation. This does not happen in (Common) Lisp.
Furthermore, there are some rather handy convenience variables in Common Lisp (CL) that are not available (at least as far as I know) in Python. Both CL and Python have "value of last expression" (* in CL, _ in Python), but CL also has ** (value of expression before last) and *** (the value of the one before that) and +, ++ and +++ (the expressions themselves). CL also doesn't distinguish between expressions and statements (in essence, everything is an expression) and all of that does help build a much richer REPL experience.
As I said at the beginning, it is more a difference in grade than difference in kind. But had the gap been only a smidgen wider between them, it would probably be a difference in kind, as well.
It's not uncommon for an intro programming class to write a Lisp metacircular evaluator. Has there been any attempt at doing this for Python?
Yes, I know that Lisp's structure and syntax lends itself nicely to a metacircular evaluator, etc etc. Python will most likely be more difficult. I am just curious as to whether such an attempt has been made.
For those who don't know what a meta-circular evaluator is, it is an interpreter which is written in the language to be interpreted. For example: a Lisp interpreter written in Lisp, or in our case, a Python interpreter written in Python. For more information, read this chapter from SICP.
As JBernardo said, PyPy is one. However, PyPy's Python interpreter, the meta-circular evaluator that is, is implemented in a statically typed subset of Python called RPython.
You'll be pleased to know that, as of the 1.5 release, PyPy is fully compliant with the official Python 2.7 specification. Even more so: PyPy nearly always beats Python in performance benchmarks.
For more information see PyPy docs and PyPy extra docs.
I think i wrote one here:
"""
Metacircular Python interpreter with macro feature.
By Cees Timmerman, 14aug13.
"""
import re
re_macros = re.compile("^#define (\S+) ([^\r\n]+)", re.MULTILINE)
def meta_python_exec(code):
# Optional meta feature.
macros = re_macros.findall(code)
code = re_macros.sub("", code)
for m in macros:
code = code.replace(m[0], m[1])
# Run the code.
exec(code)
if __name__ == "__main__":
#code = open("metacircular_overflow.py", "r").read() # Causes a stack overflow in Python 3.2.3, but simply raises "RuntimeError: maximum recursion depth exceeded while calling a Python object" in Python 2.7.3.
code = "#define 1 2\r\nprint(1 + 1)"
meta_python_exec(code)
I want to programmatically edit python source code. Basically I want to read a .py file, generate the AST, and then write back the modified python source code (i.e. another .py file).
There are ways to parse/compile python source code using standard python modules, such as ast or compiler. However, I don't think any of them support ways to modify the source code (e.g. delete this function declaration) and then write back the modifying python source code.
UPDATE: The reason I want to do this is I'd like to write a Mutation testing library for python, mostly by deleting statements / expressions, rerunning tests and seeing what breaks.
Pythoscope does this to the test cases it automatically generates as does the 2to3 tool for python 2.6 (it converts python 2.x source into python 3.x source).
Both these tools uses the lib2to3 library which is an implementation of the python parser/compiler machinery that can preserve comments in source when it's round tripped from source -> AST -> source.
The rope project may meet your needs if you want to do more refactoring like transforms.
The ast module is your other option, and there's an older example of how to "unparse" syntax trees back into code (using the parser module). But the ast module is more useful when doing an AST transform on code that is then transformed into a code object.
The redbaron project also may be a good fit (ht Xavier Combelle)
The builtin ast module doesn't seem to have a method to convert back to source. However, the codegen module here provides a pretty printer for the ast that would enable you do do so.
eg.
import ast
import codegen
expr="""
def foo():
print("hello world")
"""
p=ast.parse(expr)
p.body[0].body = [ ast.parse("return 42").body[0] ] # Replace function body with "return 42"
print(codegen.to_source(p))
This will print:
def foo():
return 42
Note that you may lose the exact formatting and comments, as these are not preserved.
However, you may not need to. If all you require is to execute the replaced AST, you can do so simply by calling compile() on the ast, and execing the resulting code object.
Took a while, but Python 3.9 has this:
https://docs.python.org/3.9/whatsnew/3.9.html#ast
https://docs.python.org/3.9/library/ast.html#ast.unparse
ast.unparse(ast_obj)
Unparse an ast.AST object and generate a string with code that would produce an equivalent ast.AST object if parsed back with ast.parse().
In a different answer I suggested using the astor package, but I have since found a more up-to-date AST un-parsing package called astunparse:
>>> import ast
>>> import astunparse
>>> print(astunparse.unparse(ast.parse('def foo(x): return 2 * x')))
def foo(x):
return (2 * x)
I have tested this on Python 3.5.
You might not need to re-generate source code. That's a bit dangerous for me to say, of course, since you have not actually explained why you think you need to generate a .py file full of code; but:
If you want to generate a .py file that people will actually use, maybe so that they can fill out a form and get a useful .py file to insert into their project, then you don't want to change it into an AST and back because you'll lose all formatting (think of the blank lines that make Python so readable by grouping related sets of lines together) (ast nodes have lineno and col_offset attributes) comments. Instead, you'll probably want to use a templating engine (the Django template language, for example, is designed to make templating even text files easy) to customize the .py file, or else use Rick Copeland's MetaPython extension.
If you are trying to make a change during compilation of a module, note that you don't have to go all the way back to text; you can just compile the AST directly instead of turning it back into a .py file.
But in almost any and every case, you are probably trying to do something dynamic that a language like Python actually makes very easy, without writing new .py files! If you expand your question to let us know what you actually want to accomplish, new .py files will probably not be involved in the answer at all; I have seen hundreds of Python projects doing hundreds of real-world things, and not a single one of them needed to ever writer a .py file. So, I must admit, I'm a bit of a skeptic that you've found the first good use-case. :-)
Update: now that you've explained what you're trying to do, I'd be tempted to just operate on the AST anyway. You will want to mutate by removing, not lines of a file (which could result in half-statements that simply die with a SyntaxError), but whole statements — and what better place to do that than in the AST?
Parsing and modifying the code structure is certainly possible with the help of ast module and I will show it in an example in a moment. However, writing back the modified source code is not possible with ast module alone. There are other modules available for this job such as one here.
NOTE: Example below can be treated as an introductory tutorial on the usage of ast module but a more comprehensive guide on using ast module is available here at Green Tree snakes tutorial and official documentation on ast module.
Introduction to ast:
>>> import ast
>>> tree = ast.parse("print 'Hello Python!!'")
>>> exec(compile(tree, filename="<ast>", mode="exec"))
Hello Python!!
You can parse the python code (represented in string) by simply calling the API ast.parse(). This returns the handle to Abstract Syntax Tree (AST) structure. Interestingly you can compile back this structure and execute it as shown above.
Another very useful API is ast.dump() which dumps the whole AST in a string form. It can be used to inspect the tree structure and is very helpful in debugging. For example,
On Python 2.7:
>>> import ast
>>> tree = ast.parse("print 'Hello Python!!'")
>>> ast.dump(tree)
"Module(body=[Print(dest=None, values=[Str(s='Hello Python!!')], nl=True)])"
On Python 3.5:
>>> import ast
>>> tree = ast.parse("print ('Hello Python!!')")
>>> ast.dump(tree)
"Module(body=[Expr(value=Call(func=Name(id='print', ctx=Load()), args=[Str(s='Hello Python!!')], keywords=[]))])"
Notice the difference in syntax for print statement in Python 2.7 vs. Python 3.5 and the difference in type of AST node in respective trees.
How to modify code using ast:
Now, let's a have a look at an example of modification of python code by ast module. The main tool for modifying AST structure is ast.NodeTransformer class. Whenever one needs to modify the AST, he/she needs to subclass from it and write Node Transformation(s) accordingly.
For our example, let's try to write a simple utility which transforms the Python 2 , print statements to Python 3 function calls.
Print statement to Fun call converter utility: print2to3.py:
#!/usr/bin/env python
'''
This utility converts the python (2.7) statements to Python 3 alike function calls before running the code.
USAGE:
python print2to3.py <filename>
'''
import ast
import sys
class P2to3(ast.NodeTransformer):
def visit_Print(self, node):
new_node = ast.Expr(value=ast.Call(func=ast.Name(id='print', ctx=ast.Load()),
args=node.values,
keywords=[], starargs=None, kwargs=None))
ast.copy_location(new_node, node)
return new_node
def main(filename=None):
if not filename:
return
with open(filename, 'r') as fp:
data = fp.readlines()
data = ''.join(data)
tree = ast.parse(data)
print "Converting python 2 print statements to Python 3 function calls"
print "-" * 35
P2to3().visit(tree)
ast.fix_missing_locations(tree)
# print ast.dump(tree)
exec(compile(tree, filename="p23", mode="exec"))
if __name__ == '__main__':
if len(sys.argv) <=1:
print ("\nUSAGE:\n\t print2to3.py <filename>")
sys.exit(1)
else:
main(sys.argv[1])
This utility can be tried on small example file, such as one below, and it should work fine.
Test Input file : py2.py
class A(object):
def __init__(self):
pass
def good():
print "I am good"
main = good
if __name__ == '__main__':
print "I am in main"
main()
Please note that above transformation is only for ast tutorial purpose and in real case scenario one will have to look at all different scenarios such as print " x is %s" % ("Hello Python").
If you are looking at this in 2019, then you can use this libcst
package. It has syntax similar to ast. This works like a charm, and preserve the code structure. It's basically helpful for the project where you have to preserve comments, whitespace, newline etc.
If you don't need to care about the preserving comments, whitespace and others, then the combination of ast and astor works well.
I've created recently quite stable (core is really well tested) and extensible piece of code which generates code from ast tree: https://github.com/paluh/code-formatter .
I'm using my project as a base for a small vim plugin (which I'm using every day), so my goal is to generate really nice and readable python code.
P.S.
I've tried to extend codegen but it's architecture is based on ast.NodeVisitor interface, so formatters (visitor_ methods) are just functions. I've found this structure quite limiting and hard to optimize (in case of long and nested expressions it's easier to keep objects tree and cache some partial results - in other way you can hit exponential complexity if you want to search for best layout). BUT codegen as every piece of mitsuhiko's work (which I've read) is very well written and concise.
One of the other answers recommends codegen, which seems to have been superceded by astor. The version of astor on PyPI (version 0.5 as of this writing) seems to be a little outdated as well, so you can install the development version of astor as follows.
pip install git+https://github.com/berkerpeksag/astor.git#egg=astor
Then you can use astor.to_source to convert a Python AST to human-readable Python source code:
>>> import ast
>>> import astor
>>> print(astor.to_source(ast.parse('def foo(x): return 2 * x')))
def foo(x):
return 2 * x
I have tested this on Python 3.5.
Unfortunately none of the answers above actually met both of these conditions
Preserve the syntactical integrity for the surrounding source code (e.g keeping comments, other sorts of formatting for the rest of the code)
Actually use AST (not CST).
I've recently written a small toolkit to do pure AST based refactorings, called refactor. For example if you want to replace all placeholders with 42, you can simply write a rule like this;
class Replace(Rule):
def match(self, node):
assert isinstance(node, ast.Name)
assert node.id == 'placeholder'
replacement = ast.Constant(42)
return ReplacementAction(node, replacement)
And it will find all acceptable nodes, replace them with the new nodes and generate the final form;
--- test_file.py
+++ test_file.py
## -1,11 +1,11 ##
def main():
- print(placeholder * 3 + 2)
- print(2 + placeholder + 3)
+ print(42 * 3 + 2)
+ print(2 + 42 + 3)
# some commments
- placeholder # maybe other comments
+ 42 # maybe other comments
if something:
other_thing
- print(placeholder)
+ print(42)
if __name__ == "__main__":
main()
We had a similar need, which wasn't solved by other answers here. So we created a library for this, ASTTokens, which takes an AST tree produced with the ast or astroid modules, and marks it with the ranges of text in the original source code.
It doesn't do modifications of code directly, but that's not hard to add on top, since it does tell you the range of text you need to modify.
For example, this wraps a function call in WRAP(...), preserving comments and everything else:
example = """
def foo(): # Test
'''My func'''
log("hello world") # Print
"""
import ast, asttokens
atok = asttokens.ASTTokens(example, parse=True)
call = next(n for n in ast.walk(atok.tree) if isinstance(n, ast.Call))
start, end = atok.get_text_range(call)
print(atok.text[:start] + ('WRAP(%s)' % atok.text[start:end]) + atok.text[end:])
Produces:
def foo(): # Test
'''My func'''
WRAP(log("hello world")) # Print
Hope this helps!
A Program Transformation System is a tool that parses source text, builds ASTs, allows you to modify them using source-to-source transformations ("if you see this pattern, replace it by that pattern"). Such tools are ideal for doing mutation of existing source codes, which are just "if you see this pattern, replace by a pattern variant".
Of course, you need a program transformation engine that can parse the language of interest to you, and still do the pattern-directed transformations. Our DMS Software Reengineering Toolkit is a system that can do that, and handles Python, and a variety of other languages.
See this SO answer for an example of a DMS-parsed AST for Python capturing comments accurately. DMS can make changes to the AST, and regenerate valid text, including the comments. You can ask it to prettyprint the AST, using its own formatting conventions (you can changes these), or do "fidelity printing", which uses the original line and column information to maximally preserve the original layout (some change in layout where new code is inserted is unavoidable).
To implement a "mutation" rule for Python with DMS, you could write the following:
rule mutate_addition(s:sum, p:product):sum->sum =
" \s + \p " -> " \s - \p"
if mutate_this_place(s);
This rule replace "+" with "-" in a syntactically correct way; it operates on the AST and thus won't touch strings or comments that happen to look right. The extra condition on "mutate_this_place" is to let you control how often this occurs; you don't want to mutate every place in the program.
You'd obviously want a bunch more rules like this that detect various code structures, and replace them by the mutated versions. DMS is happy to apply a set of rules. The mutated AST is then prettyprinted.
I used to use baron for this, but have now switched to parso because it's up to date with modern python. It works great.
I also needed this for a mutation tester. It's really quite simple to make one with parso, check out my code at https://github.com/boxed/mutmut