I've been given a Python code, together with the modules it imports. I would like to build a tree indicating which function calls what other functions. How can I do that?
You can use the ast (abstract syntax tree) module from the Python standard library. Given a file like:
# foo.py
def func(x):
    print('hello')
Parse the file using ast.parse:
import ast
tree = ast.parse(open('foo.py').read())
print(ast.dump(tree)) # dumps the whole tree
# get the function from the tree body (i.e. from the file's content)
func = tree.body[0]
# get the function argument names
arguments = [a.arg for a in func.args.args]
print('the function is: %s(%s)' % (func.name, ', '.join(arguments)))
outputs:
"Module(body=[FunctionDef(name='func', args=arguments(args=[arg(arg='x', annotation=None)], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]), body=[Expr(value=Call(func=Name(id='print', ctx=Load()), args=[Str(s='hello')], keywords=[]))], decorator_list=[], returns=None)])"
the function is: func(x)
You should begin from the program's main function: at the first layer, link all functions that are called from main. That gives you a starting point, and from there you can link the functions below it.
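As a rough sketch of how that bookkeeping can look with ast (the visitor class and the caller/callee dictionary here are my own illustration, not something the standard library provides):

import ast
from collections import defaultdict

class CallGraphVisitor(ast.NodeVisitor):
    """Collect which functions are called from inside each function definition."""

    def __init__(self):
        self.graph = defaultdict(set)   # caller name -> set of callee names
        self.current = None             # name of the enclosing function, if any

    def visit_FunctionDef(self, node):
        previous, self.current = self.current, node.name
        self.generic_visit(node)        # walk the body, including nested calls
        self.current = previous

    def visit_Call(self, node):
        # Only plain name calls are recorded; attribute calls (obj.method())
        # would need extra handling.
        if self.current is not None and isinstance(node.func, ast.Name):
            self.graph[self.current].add(node.func.id)
        self.generic_visit(node)

tree = ast.parse(open('foo.py').read())
visitor = CallGraphVisitor()
visitor.visit(tree)
print(dict(visitor.graph))   # e.g. {'func': {'print'}} for the foo.py above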
I am attempting to implement a decorator that receives a function, parses it into an AST, eventually will do something to the AST, then reconstruct the original (or modified) function from the AST and return it. My current approach is, once I have the AST, compile it to a code <module> object, then get the constant in it with the name of the function, convert it to FunctionType, and return it. I have the following:
import ast, inspect, types

def as_ast(f):
    source = inspect.getsource(f)
    source = '\n'.join(source.splitlines()[1:])  # Remove as_ast decoration, pretend there can be no other decorations for now
    tree = ast.parse(source)
    print(ast.dump(tree, indent=4))  # Debugging log
    # I would modify the AST somehow here
    filename = f.__code__.co_filename
    code = compile(tree, filename, 'exec')
    func_code = next(
        filter(
            lambda x: isinstance(x, types.CodeType) and x.co_name == f.__name__,
            code.co_consts))  # Get function object
    func = types.FunctionType(func_code, {})
    return func

@as_ast
def test(arg: int=4):
    print(f'{arg=}')
Now, I would expect that calling test later in this source code will simply have the effect of calling test if the decorator were absent, which is what I observe, so long as I pass an argument for arg. However, if I pass no argument, instead of using the default I gave (4), it throws a TypeError for the missing argument. This makes it pretty clear that my approach for getting a callable function from the AST is not quite correct, as the default argument is not applied, and there may be other details that would slip through as it is now. How might I be able to correctly recreate the function from the AST? The way I currently go from the code module object to the function code object also seems... off intuitively, but I do not know how else one might achieve this.
The root node of the AST is a Module. Calling compile() on the AST results in a code object for a module. Disassembling the returned code object with dis.dis() (from the standard library) shows that the module-level code builds the function and stores it in the global namespace. So the easiest thing to do is exec the compiled code and then get the function from the 'global' environment of the exec call.
The AST node for the function includes a list of the decorators to be applied to the function. Any decorators that haven't been applied yet should be deleted from the list so they don't get applied twice (once when this decorator compiles the code, and once after this decorator returns). And delete this decorator from the list or you'll get an infinite recursion. The question is what to do with any decorators that came before this one. They have already run, but their result is tossed out because this decorator (as_ast) goes back to the source code. You can leave them in the list so they get rerun, or delete them if they don't matter.
In the code below, all the decorators are deleted from the parse tree, under the assumption that the as_ast decorator is applied first. The call to exec() uses a copy of globals() so the decorator has access to any other globally visible names (variables, functions, etc.). See the docs for exec() for other considerations. Uncomment the print statements to see what is going on.
import ast
import dis
import inspect
import types

def as_ast(f):
    source = inspect.getsource(f)
    #print(f"=== source ===\n{source}")
    tree = ast.parse(source)
    #print(f"\n=== original ===\n{ast.dump(tree, indent=4)}")

    # Remove the decorators from the AST, because the modified function will
    # be passed to them anyway and we don't want them to be called twice.
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            node.decorator_list.clear()

    # Make modifications to the AST here

    #print(f"\n=== revised ===\n{ast.dump(tree, indent=4)}")
    name = f.__code__.co_name
    code = compile(tree, name, 'exec')
    #print("\n=== byte code ===")
    #dis.dis(code)
    #print()
    temp_globals = dict(globals())
    exec(code, temp_globals)
    return temp_globals[name]
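As a quick sanity check (reusing the test function from the question), this version should now honour the default argument:

@as_ast
def test(arg: int = 4):
    print(f'{arg=}')

test()     # prints arg=4, the default is applied this time
test(10)   # prints arg=10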
Note: this decorator has not been tested much and has not been tested at all on methods or nested functions.
An interesting idea would be for as_ast to return the AST. Then subsequent decorators could manipulate the AST. Lastly, a from_ast decorator could compile the modified AST into a function.
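A minimal sketch of that idea might look like the following (to_tree, from_ast and greet are names made up for illustration; to_tree is essentially as_ast without the final compile/exec step, and the code needs to live in a .py file so inspect can find the source):

import ast
import inspect

def to_tree(f):
    """Return the function's AST (with decorators stripped) instead of a function."""
    tree = ast.parse(inspect.getsource(f))
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            node.decorator_list.clear()
    return tree

def from_ast(tree):
    """Compile an ast.Module holding a single function def back into a function."""
    name = tree.body[0].name
    code = compile(ast.fix_missing_locations(tree), '<ast>', 'exec')
    namespace = dict(globals())
    exec(code, namespace)
    return namespace[name]

@from_ast   # runs last: turns the (possibly modified) AST back into a function
@to_tree    # runs first: turns the source of greet into an AST
def greet(name='world'):
    print(f'hello {name}')

greet()     # prints 'hello world'

Any AST-manipulating decorators would then go between from_ast and to_tree.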
When defining code dynamically in Python (e.g. through exec or loading it from some other medium other than import), I am unable to get to the source of the defined function.
inspect.getsource seems to look up the source in the file the containing module was loaded from.
import inspect

code = """
def my_function():
    print("Hello dears")
"""
exec(code)

my_function()  # Works, as expected
print(inspect.getsource(my_function))  # Fails with OSError('could not get source code')
Is there any other way to get at the source of a dynamically interpreted function (or other object, for that matter)?
One option would be to dump the source to a file and exec from there, though that litters your filesystem with garbage you need to clean up.
A somewhat less reliable but less garbagey alternative would be to rebuild the source (-ish) from the bytecode, using astor.to_source() for instance. It will give you a "corresponding" source but may alter formatting or lose metadata compared to the original.
The simplest would be to simply attach your original source to the created function object:
code = """
def my_function():
print("Hello dears")
"""
exec(code)
my_function.__source__ = code # has nothing to do with getsource
One more alternative (though probably not useful here, as I assume you want the body to be created dynamically, from a template for instance) would be to swap the code object for one you've updated with the correct/relevant co_firstlineno (and optionally co_filename, though you can also set that in the compile() call). That's only useful if, for some odd reason, you have your Python code literally embedded in another file but can't or don't want to extract it into its own module for normal evaluation.
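A rough sketch of that last idea, assuming Python 3.8+ (which added CodeType.replace()) and a hypothetical embedded_source.py that really exists and contains the function's source starting at line 10:

# Point the function's code object at the file that actually holds its source.
# 'embedded_source.py' and the line number 10 are placeholders for this sketch.
my_function.__code__ = my_function.__code__.replace(
    co_filename='embedded_source.py',
    co_firstlineno=10,
)
# inspect.getsource(my_function) should then read those lines from that file.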
You can do it more or less like this:
import inspect

source = """
def foo():
    print("Hello World")
"""

file_name = '/tmp/foo.py'  # you can use any hash function to build the name
with open(file_name, 'w') as f:
    f.write(source)

code = compile(source, file_name, 'exec')
exec(code)

foo()  # Works, as expected
print(inspect.getsource(foo))
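If you would rather not hard-code the path, a variation of the same trick could use the tempfile module (just a sketch, reusing the source string from above):

import inspect, os, tempfile

with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
    f.write(source)        # the same source string as above
    file_name = f.name     # a unique path instead of a hard-coded '/tmp/foo.py'

exec(compile(source, file_name, 'exec'))
print(inspect.getsource(foo))  # works, because the file really exists

os.remove(file_name)  # clean up once the source is no longer needed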
I'm writing a wrapper or pipeline to create a tfrecords dataset to which I would like to supply a function to apply to the dataset.
I would like to make it possible for the user to inject a function defined in another python file which is called in my script to transform the data.
Why? The only thing the user has to do is write the function which brings his data into the right format, then the existing code does the rest.
I'm aware of the fact that I could have the user write the function in the same file and call it, or to have an import statement etc.
So as a minimal example, I would like to have file y.py
def main(argv):
    # Parse args etc., let's assume it is there.
    dataset = tf.data.TFRecordDataset(args.filename)
    dataset = dataset.map(args.function)
    # Continue with doing stuff that is independent from the actual content
So what I'd like to be able to do is something like this
python y.py --func x.py my_func
And then use the function my_func defined in x.py inside dataset.map(...).
Is there a way to do this in python and if yes, which is the best way to do it?
Pass the name of the file (and the function name) as arguments to your script.
Read the file into a string, possibly extracting just the given function.
Use Python's exec() to execute the code.
An example:
file = "def fun(*args): \n return args"
func = "fun(1,2,3)"
def execute(func, file):
program = file + "\nresult = " + func
local = {}
exec(program, local)
return local['result']
r = execute(func, file)
print(r)
Because exec is not being called at global scope here, the result is read back from the explicit namespace dict (local) that is passed to exec.
Note: the use of exec is somewhat dangerous; you should be sure that the function is safe. If you are the one supplying it, then it's fine!
Hope this helps.
OK, I have now composed the answer myself, using information from the comments and this answer.
import importlib, inspect, sys, os

# path is the given path to the file, function_name is the name of the function,
# and args are the function arguments

# Create package and module name from the path
package = os.path.dirname(path).replace(os.path.sep, '.')
module_name = os.path.basename(path).split('.')[0]

# Import the module and get its members
# (the package argument only matters for relative imports, i.e. when the module
# name starts with a dot; the file's directory must be importable)
module = importlib.import_module(module_name, package)
members = inspect.getmembers(module)

# Find the matching function
function = [t[1] for t in members if t[0] == function_name][0]
function(args)
This solves the question exactly, since I get a callable function object which I can call, pass around, and use like a normal function.
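For reference, a variant of the same idea that loads the module directly from the file path (so its directory does not have to be importable as a package) could look roughly like this; x.py and my_func are just the hypothetical names from the question:

import importlib.util

def load_function(path, function_name):
    """Load function_name from the Python file at path."""
    spec = importlib.util.spec_from_file_location('user_module', path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, function_name)

func = load_function('x.py', 'my_func')  # hypothetical file/function names
# dataset = dataset.map(func)            # then used as in the question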
I have a file that contains several python functions, each with some statements.
def func1():
    codeX...

def func2():
    codeY...
codeX and codeY can be multiple statements. I want to be able to parse the file, find a function by name, then evaluate the code in that function.
With the ast module, I can parse the file, find the FunctionDef objects, and get the list of Stmt objects, but how do I turn this into bytecode that I can pass to eval? Should I use the compile module, or the parser module instead?
Basically, the function defs are just used to create separate blocks of code. I want to be able to grab any block of code given the name and then execute that code in eval (providing my own local/global scope objects). If there is a better way to do this than what I described that would be helpful too.
Thanks
I want to be able to grab any block of code given the name and then execute that code ... (providing my own local/global scope objects).
A naive solution looks like this. This is based on the assumption that the functions don't all depend on global variables.
from file_that_contains_several_python_functions import *
Direction = some_value
func1()
func2()
func3()
That should do exactly what you want.
However, if all of your functions rely on global variables -- a design that calls to mind 1970's-era FORTRAN -- then you have to do something slightly more complex.
from file_that_contains_several_python_functions import *
Direction = some_value
func1( globals() )
func2( globals() )
func3( globals() )
And you have to rewrite all of your global-using functions like this.
def func1(context):
    globals().update(context)
    # Now you have access to all kinds of global variables
This seems ugly because it is. Functions which rely entirely on global variables are not really the best idea.
Using Python 2.6.4:
text = """
def fun1():
print 'fun1'
def fun2():
print 'fun2'
"""
import ast
tree = ast.parse(text)
# tree.body[0] contains FunctionDef for fun1, tree.body[1] for fun2
wrapped = ast.Interactive(body=[a.body[1]])
code = compile(wrapped, 'yourfile', 'single')
eval(code)
fun2() # prints 'fun2'
Take a look at the grammar in the ast docs: http://docs.python.org/library/ast.html#abstract-grammar. The top-level statement must be either Module, Interactive or Expression, so you need to wrap the function def in one of those.
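For completeness, a small helper along those lines (written for Python 3.8+ and with names of my own choosing) could pick a block out by name and run just its body in scopes you supply, which is what the question asks for:

import ast

def run_function_body(filename, func_name, global_scope, local_scope):
    """Find the named def in the file and exec its body in the given scopes."""
    with open(filename) as f:
        tree = ast.parse(f.read())
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            body = ast.Module(body=node.body, type_ignores=[])
            code = compile(ast.fix_missing_locations(body), filename, 'exec')
            exec(code, global_scope, local_scope)
            return
    raise NameError('no function named %r in %s' % (func_name, filename))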
If you're using Python 2.6 or later, then the compile() function accepts AST objects in addition to source code.
>>> import ast
>>> a = ast.parse("print('hello world')")
>>> x = compile(a, "(none)", "exec")
>>> eval(x)
hello world
These modules have all been rearranged for Python 3.
I'm making a script parser in Python and I'm a little stuck. I am not quite sure how to parse a line for all its functions (or even just one function at a time), search for a function with that name and, if it exists, execute that function, short of writing a massive if/elif/else block...
EDIT
This is for my own scripting language that I'm making. It's nothing very complex, but I have a standard library of 8 or so functions that need to be runnable. How can I parse a line and run the function named in that line?
Once you get the name of the function, use a dispatch dict to run the function:
def mysum(...): ...
def myotherstuff(...): ...
# create dispatch dict:
myfunctions = {'sum': mysum, 'stuff': myotherstuff}
# run your parser:
function_name, parameters = parse_result(line)
# run the function:
myfunctions[function_name](parameters)
Alternatively create a class with the commands:
class Commands(object):
    def do_sum(self, ...): ...
    def do_stuff(self, ...): ...

    def run(self, funcname, params):
        getattr(self, 'do_' + funcname)(params)

cmd = Commands()
function_name, parameters = parse_result(line)
cmd.run(function_name, parameters)
You could also look at the cmd module in the stdlib to build your class. It can automatically give you a command-line interface for your language, with tab completion of commands.
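A minimal sketch of the cmd-module approach (the command names here are illustrative; cmd.Cmd dispatches a line like 'sum 1 2 3' to do_sum automatically):

import cmd

class MyShell(cmd.Cmd):
    prompt = '> '

    def do_sum(self, arg):
        """sum a b c -- print the sum of the numbers given."""
        print(sum(float(x) for x in arg.split()))

    def do_quit(self, arg):
        """quit -- leave the shell."""
        return True  # returning a true value stops the command loop

if __name__ == '__main__':
    MyShell().cmdloop()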
Check out PyParsing, it allows for definition of the grammar directly in Python code:
Assuming a function call is just somename():
>>> from pyparsing import *
>>> grammar = Word(alphas + "_", alphanums + "_")("func_name") + "()" + StringEnd()
>>> grammar.parseString("ab()\n")["func_name"]
"ab"
Take a look at PLY. It should help you keep your parser specification clean.
It all depends on what code you are parsing.
If you are parsing Python syntax, use the parser module from Python:
http://docs.python.org/library/parser.html
You can find a fairly complete list of parser libraries available for Python at: http://nedbatchelder.com/text/python-parsers.html