I am attempting to implement a decorator that receives a function, parses it into an AST, eventually will do something to the AST, then reconstruct the original (or modified) function from the AST and return it. My current approach is, once I have the AST, compile it to a code <module> object, then get the constant in it with the name of the function, convert it to FunctionType, and return it. I have the following:
import ast, inspect, types

def as_ast(f):
    source = inspect.getsource(f)
    source = '\n'.join(source.splitlines()[1:]) # Remove as_ast decoration, pretend there can be no other decorations for now
    tree = ast.parse(source)
    print(ast.dump(tree, indent=4)) # Debugging log
    # I would modify the AST somehow here
    filename = f.__code__.co_filename
    code = compile(tree, filename, 'exec')
    func_code = next(
        filter(
            lambda x: isinstance(x, types.CodeType) and x.co_name == f.__name__,
            code.co_consts)) # Get function object
    func = types.FunctionType(func_code, {})
    return func

@as_ast
def test(arg: int=4):
    print(f'{arg=}')
Now, I would expect that calling test later in this source code will simply have the effect of calling test if the decorator were absent, which is what I observe, so long as I pass an argument for arg. However, if I pass no argument, instead of using the default I gave (4), it throws a TypeError for the missing argument. This makes it pretty clear that my approach for getting a callable function from the AST is not quite correct, as the default argument is not applied, and there may be other details that would slip through as it is now. How might I be able to correctly recreate the function from the AST? The way I currently go from the code module object to the function code object also seems... off intuitively, but I do not know how else one might achieve this.
The root node of the AST is a Module. Calling compile() on the AST results in a code object for a module. Disassembling that code object with dis.dis() from the standard library shows that the module-level code builds the function and stores it in the global namespace. So the easiest thing to do is exec the compiled code and then get the function from the 'global' environment of the exec call.
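As a minimal sketch of that exec-and-look-up approach (the source string and the "<ast-demo>" filename are made up for illustration, and the function returns a string instead of printing so the result is easy to check), the default argument survives because the module-level code builds the whole function:

```python
import ast

# Hypothetical source text standing in for what inspect.getsource() would return.
source = '''
def test(arg: int = 4):
    return f"arg={arg}"
'''

tree = ast.parse(source)                    # root node is ast.Module
code = compile(tree, "<ast-demo>", "exec")  # module-level code object

namespace = {}
exec(code, namespace)       # running the module code defines the function
test = namespace["test"]    # fetch the finished function by name

print(test())               # the default is applied
print(test.__defaults__)
```

Because the function is created by the interpreter rather than assembled by hand with types.FunctionType, defaults, annotations and closures over globals are all set up the usual way.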
The AST node for the function includes a list of the decorators to be applied to the function. Any decorators that haven't been applied yet should be deleted from the list so they don't get applied twice (once when this decorator compiles the code, and once after this decorator returns). And delete this decorator from the list or you'll get an infinite recursion. The question is what to do with any decorators that came before this one. They have already run, but their result is tossed out because this decorator (as_ast) goes back to the source code. You can leave them in the list so they get rerun, or delete them if they don't matter.
In the code below, all the decorators are deleted from the parse tree, under the assumption that the as_ast decorator is applied first. The call to exec() uses a copy of globals() so the decorator has access to any other globally visible names (variables, functions, etc). See the docs for exec() for other considerations. Uncomment the print statements to see what is going on.
import ast
import dis
import inspect
import types

def as_ast(f):
    source = inspect.getsource(f)
    #print(f"=== source ===\n{source}")
    tree = ast.parse(source)
    #print(f"\n=== original ===\n{ast.dump(tree, indent=4)}")
    # Remove the decorators from the AST, because the modified function will
    # be passed to them anyway and we don't want them to be called twice.
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            node.decorator_list.clear()
    # Make modifications to the AST here
    #print(f"\n=== revised ===\n{ast.dump(tree, indent=4)}")
    name = f.__code__.co_name
    code = compile(tree, name, 'exec')
    #print("\n=== byte code ===")
    #dis.dis(code)
    #print()
    temp_globals = dict(globals())
    exec(code, temp_globals)
    return temp_globals[name]
Note: this decorator has not been tested much and has not been tested at all on methods or nested functions.
An interesting idea would be for as_ast to return the AST. Then subsequent decorators could manipulate the AST. Lastly, a from_ast decorator could compile the modified AST into a function.
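A rough sketch of the compile-back step of that idea follows. It is written as a plain function rather than a decorator, and the from_ast signature, the AddToMult transformer and the "<from_ast>" filename are all hypothetical names chosen for illustration:

```python
import ast

class AddToMult(ast.NodeTransformer):
    """Hypothetical AST pass: rewrite every `a + b` into `a * b`."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # transform nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

def from_ast(tree, name, namespace=None):
    """Compile a Module AST and return the function called `name` from it."""
    namespace = {} if namespace is None else dict(namespace)
    ast.fix_missing_locations(tree)       # new nodes need line/column info
    exec(compile(tree, "<from_ast>", "exec"), namespace)
    return namespace[name]

tree = ast.parse("def combine(a, b=3):\n    return a + b\n")
tree = AddToMult().visit(tree)
combine = from_ast(tree, "combine")

print(combine(2))   # now computes 2 * 3
```

Keeping the tree as the value passed between stages means each stage only has to deal with AST nodes; only the last stage needs to know about compile() and exec().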
I was reading Ian Goodfellow's GAN source code on GitHub (link https://github.com/goodfeli/adversarial/blob/master/deconv.py). In particular, at lines 40-41, the code is:
@functools.wraps(Model.get_lr_scalers)
def get_lr_scalers(self):
It's a rather unfamiliar way of using wraps, and it seems the goal is to replace the get_lr_scalers with a user defined function. But in that case, we don't really need a wrapper for that, right? I don't really know the purpose of wraps in this case.
wraps copies a number of attributes from another function onto this function—by default, __module__, __name__, __qualname__, __annotations__ and __doc__.
The most obviously useful one to copy over is the __doc__. Consider this simpler example [1]:
import functools

class Base:
    def spam(self, breakfast):
        """spam(self, breakfast) -> breakfast with added spam

        <29 lines of detailed information here>
        """

class Child:
    @functools.wraps(Base.spam)
    def spam(self, breakfast):
        newbreakfast = breakfast.copy()
        newbreakfast.meats['spam'] += 30
        return newbreakfast
Now if someone wants to use help(mychild.spam), they'll get the 29 lines of useful information. (Or, if they autocomplete mychild.spam in PyCharm, it'll pop up the overlay with the documentation, etc.) All without me having to manually copy and paste it. And, even better, if Base came from some framework that I didn't write, and my user upgrades from 1.2.3 to 1.2.4 of that framework, and there's a better docstring, they'll see that better docstring.
In the most common case, Child would be a subclass of Base, and spam would be an override [2]. But that isn't actually required: wraps doesn't care whether you're subtyping via inheritance, or duck typing by just implementing an implicit protocol; it's equally useful for both cases. As long as Child is intended to implement the spam protocol from Base, it makes sense for Child.spam to have the same docstring (and maybe other metadata attributes).
Other attributes probably aren't quite as useful as docstrings. For example, if you're using type annotations, their benefit in reading the code is probably at least as high as their benefit in being able to run Mypy for static type checking, so just copying them over dynamically from another method often isn't all that useful. And __module__ and __qualname__ are primarily used for reflection/inspection, and are more likely to be misleading than helpful in this case (although you could probably come up with an example of a framework where you'd want people to read the code in Base instead of the code in Child, that isn't true for the default obvious example). But, unless they're actively harmful, the readability cost of using @functools.wraps(Base.spam, assigned=('__doc__',)) instead of just the defaults may not be worth it.
1. If you're using Python 2, change these classes to inherit from object; otherwise they'll be old-style classes, which just complicates things in an irrelevant way. In Python 3, there are no old-style classes, so this issue can't even arise.
2. Or maybe a "virtual subclass" of an ABC, declared via a register call, or via a subclass hook.
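To make the difference between the defaults and assigned=('__doc__',) concrete, here is a small sketch. The DocOnlyChild class and the list-based breakfast are invented for the example:

```python
import functools

class Base:
    def spam(self, breakfast):
        """Return breakfast with added spam."""

class Child:
    @functools.wraps(Base.spam)          # defaults: __doc__, __name__, __qualname__, ...
    def spam(self, breakfast):
        return breakfast + ["spam"]

class DocOnlyChild:
    @functools.wraps(Base.spam, assigned=("__doc__",))   # copy only the docstring
    def spam(self, breakfast):
        return breakfast + ["spam"]

print(Child.spam.__doc__)
print(Child.spam.__qualname__)           # taken from Base.spam
print(DocOnlyChild.spam.__qualname__)    # left untouched
```

In both cases wraps also sets __wrapped__ on the new function, so tools that follow that attribute will still reach Base.spam.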
The purpose of @wraps is to copy the meta information of one function to another function. This is usually done when replacing the original function by wrapping it, which is often done by decorators.
But in the general case, here is what it does in an example:
import functools

def f1():
    """Function named f1. Prints 'f1'."""
    print('f1')

@functools.wraps(f1)
def f2():
    print('f2')
Now, you can test what happened:
>>> f1
<function f1 at 0x006AD8E8>
>>> f2
<function f1 at 0x006AD978>
>>> f1()
f1
>>> f2()
f2
>>> f1.__doc__
"Function named f1. Prints 'f1'."
>>> f2.__doc__
"Function named f1. Prints 'f1'."
When you call f2, it is obvious that it is actually f2, but when you inspect it, it behaves like f1 - it has the same doc string and the same name.
What is that good for? For this:
f1 = f2
Now the original f1 is replaced with a new functionality, but it still looks like f1 from the outside.
It is usually done in a decorator:
def replace(func):
    @functools.wraps(func)
    def replacement():
        print('replacement')
    return replacement

@replace
def f1():
    """Function named f1. Prints 'f1'."""
    print('f1')
And it behaves like this:
>>> f1()
replacement
>>> f1
<function f1 at 0x006AD930>
>>> f1.__name__
'f1'
>>> f1.__doc__
"Function named f1. Prints 'f1'."
I am editing PROSS.py to work with .cif files for protein structures. Inside the existing PROSS.py, there are the following functions (I believe that's the correct name if they're not associated with any class?), just existing within the .py file:
...
def unpack_pdb_line(line, ATOF=_atof, ATOI=_atoi, STRIP=string.strip):
...
...
def read_pdb(f, as_protein=0, as_rna=0, as_dna=0, all_models=0,
unpack=unpack_pdb_line, atom_build=atom_build):
I am adding an options parser for command line arguments, and one of the options is to specify an alternate method to use besides unpack_pdb_line. So the pertinent part of the options parser is:
...
parser.add_option("--un", dest="unpack_method", default="unpack_pdb_line", type="string", help="Unpack method to use. Default is unpack_pdb_line.")
...
unpack=options.unpack_method
However, options.unpack_method is a string and I need to use the function with the same name as the string inside options.unpack_method. How do I use getattr etc to convert the string into the actual function name?
Thanks,
Paul
Usually you just use a dict and store (func_name, function) pairs:
unpack_options = { 'unpack_pdb_line' : unpack_pdb_line,
'some_other' : some_other_function }
unpack_function = unpack_options[options.unpack_method]
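A slightly fuller sketch of the dict approach, with a readable error for unknown names. The function bodies, unpack_cif_line, and get_unpack_function are placeholders invented here, not part of PROSS.py:

```python
def unpack_pdb_line(line):
    """Stand-in for the real PDB unpacker (hypothetical body)."""
    return ("pdb", line)

def unpack_cif_line(line):
    """Hypothetical alternative unpacker for .cif files."""
    return ("cif", line)

# Only names listed here can be selected from the command line.
UNPACK_OPTIONS = {
    "unpack_pdb_line": unpack_pdb_line,
    "unpack_cif_line": unpack_cif_line,
}

def get_unpack_function(name):
    """Map an option string to its function, failing with a clear message."""
    try:
        return UNPACK_OPTIONS[name]
    except KeyError:
        choices = ", ".join(sorted(UNPACK_OPTIONS))
        raise ValueError(
            f"unknown unpack method {name!r}; choose from: {choices}"
        ) from None

unpack = get_unpack_function("unpack_cif_line")
print(unpack("ATOM 1"))
```

The returned object can then be passed straight through as the unpack= argument, in place of the option string.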
If you want to exploit the dictionaries (&c) that Python's already keeping on your behalf, I'd suggest:
import sys

def str2fun(astr):
    module, _, function = astr.rpartition('.')
    if module:
        __import__(module)
        mod = sys.modules[module]
    else:
        mod = sys.modules['__main__']  # or whatever's the "default module"
    return getattr(mod, function)
You'll probably want to check the function's signature (and catch exceptions to provide nicer error messages) e.g. via inspect, but this is a useful general-purpose function.
It's easy to add a dictionary of shortcuts, as a fallback, if some known functions' full string names (including module/package qualifications) are unwieldy to express this way.
Note we don't use __import__'s result (it doesn't work right when the function is in a module inside some package, as __import__ returns the top-level name of the package... just accessing sys.modules after the import is more practical).
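For comparison, a sketch of the same lookup using importlib.import_module, which returns the requested submodule directly, plus the kind of sanity check mentioned above. The default_module parameter and the callable check are additions made for this example:

```python
import importlib
import inspect

def str2fun(astr, default_module="__main__"):
    """Resolve 'package.module.function' to the function object."""
    module_name, _, function_name = astr.rpartition(".")
    # import_module returns the leaf module itself, not the top-level package
    module = importlib.import_module(module_name or default_module)
    obj = getattr(module, function_name)
    if not callable(obj):
        raise TypeError(f"{astr!r} does not name a callable")
    return obj

join = str2fun("os.path.join")
print(join("a", "b"))

# inspect.signature can be used to check the parameters before calling
print(inspect.signature(str2fun))
```

An AttributeError or ImportError raised here can be caught by the caller and turned into a friendlier command-line error message.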
vars()["unpack_pdb_line"]() will work too.
globals() and locals() work in a similar way:
>>> def a():return 1
>>>
>>> vars()["a"]
<function a at 0x009D1230>
>>>
>>> vars()["a"]()
1
>>> locals()["a"]()
1
>>> globals()["a"]()
1
Cheers,
If you are taking input from a user, for the sake of security it is probably best to
use a hand-made dict which will accept only a well-defined set of admissible user inputs:
unpack_options = { 'unpack_pdb_line' : unpack_pdb_line,
'unpack_pdb_line2' : unpack_pdb_line2,
}
Ignoring security for a moment, let us note in passing that an easy way to
go from (strings of variable names) to (the value referenced by the variable name)
is to use the globals() builtin dict:
unpack_function=globals()['unpack_pdb_line']
Of course, that will only work if the variable unpack_pdb_line is in the global namespace.
If you need to reach into a package for a module, or a module for a variable, then
you could use this function
import sys

def str_to_obj(astr):
    print('processing %s'%astr)
    try:
        return globals()[astr]
    except KeyError:
        try:
            __import__(astr)
            mod=sys.modules[astr]
            return mod
        except ImportError:
            module,_,basename=astr.rpartition('.')
            if module:
                mod=str_to_obj(module)
                return getattr(mod,basename)
            else:
                raise
You could use it like this:
str_to_obj('scipy.stats')
# <module 'scipy.stats' from '/usr/lib/python2.6/dist-packages/scipy/stats/__init__.pyc'>
str_to_obj('scipy.stats.stats')
# <module 'scipy.stats.stats' from '/usr/lib/python2.6/dist-packages/scipy/stats/stats.pyc'>
str_to_obj('scipy.stats.stats.chisquare')
# <function chisquare at 0xa806844>
It works for nested packages, modules, functions, or (global) variables.
function = eval_dottedname(name if '.' in name else "%s.%s" % (__name__, name))
Where eval_dottedname():
from functools import reduce  # reduce is not a builtin in Python 3

def eval_dottedname(dottedname):
    """
    >>> eval_dottedname("os.path.join") #doctest: +ELLIPSIS
    <function join at 0x...>
    >>> eval_dottedname("sys.exit") #doctest: +ELLIPSIS
    <built-in function exit>
    >>> eval_dottedname("sys") #doctest: +ELLIPSIS
    <module 'sys' (built-in)>
    """
    return reduce(getattr, dottedname.split(".")[1:],
                  __import__(dottedname.partition(".")[0]))
eval_dottedname() is the only one among all answers that supports arbitrary names with multiple dots in them, e.g. 'datetime.datetime.now'. It doesn't work for nested modules that require an explicit import, but I can't even remember an example of such a module from the stdlib.
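One way to cover both cases, sketched here under the assumption that attribute access should be tried first and an import only as a fallback (resolve_dotted is a name made up for this example). xml.etree.ElementTree is a standard-library submodule that may not be reachable by getattr alone until it has been imported:

```python
import importlib

def resolve_dotted(name):
    """Resolve a dotted name, importing submodules only when needed."""
    parts = name.split(".")
    obj = importlib.import_module(parts[0])   # first part must be a module
    for i, part in enumerate(parts[1:], start=1):
        try:
            obj = getattr(obj, part)
        except AttributeError:
            # Not an attribute yet: it may be a submodule that still
            # needs an explicit import.
            obj = importlib.import_module(".".join(parts[:i + 1]))
    return obj

now = resolve_dotted("datetime.datetime.now")
fromstring = resolve_dotted("xml.etree.ElementTree.fromstring")
print(fromstring("<a/>").tag)
```

If a part is neither an attribute nor an importable submodule, the ImportError from import_module propagates to the caller.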
I'm starting to code in various projects using Python (including Django web development and Panda3D game development).
To help me understand what's going on, I would like to basically 'look' inside the Python objects to see how they tick - like their methods and properties.
So say I have a Python object, what would I need to print out its contents? Is that even possible?
Python has a strong set of introspection features.
Take a look at the following built-in functions:
type()
dir()
id()
getattr()
hasattr()
globals()
locals()
callable()
type() and dir() are particularly useful for inspecting the type of an object and its set of attributes, respectively.
object.__dict__
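A quick sketch of several of these builtins applied to one object (the Robot class is invented for the example):

```python
class Robot:
    """Hypothetical example class."""
    wheels = 4                      # class attribute

    def __init__(self, name):
        self.name = name            # instance attribute

    def greet(self):
        return f"I am {self.name}"

r = Robot("R2")

print(type(r).__name__)                                  # Robot
public = [n for n in dir(r) if not n.startswith("_")]
print(public)                                            # ['greet', 'name', 'wheels']
print(hasattr(r, "name"), getattr(r, "missing", None))   # True None
print(callable(r.greet), callable(r.name))               # True False
print(r.__dict__)                                        # {'name': 'R2'}
```

Note that dir() also reports the class attribute and the method, while __dict__ holds only what is stored on the instance itself.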
I'm surprised no one's mentioned help yet!
In [1]: def foo():
...: "foo!"
...:
In [2]: help(foo)
Help on function foo in module __main__:
foo()
foo!
Help lets you read the docstring and get an idea of what attributes a class might have, which is pretty helpful.
First, read the source.
Second, use the dir() function.
If this is for exploration to see what's going on, I'd recommend looking at IPython. This adds various shortcuts to obtain an object's documentation, properties and even source code. For instance, appending a "?" to a function will give the help for the object (effectively a shortcut for "help(obj)"), whereas using two ?'s ("func??") will display the source code if it is available.
There are also a lot of additional conveniences, like tab completion, pretty printing of results, result history etc. that make it very handy for this sort of exploratory programming.
For more programmatic use of introspection, the basic builtins like dir(), vars(), getattr etc will be useful, but it is well worth your time to check out the inspect module. To fetch the source of a function, use "inspect.getsource" eg, applying it to itself:
>>> print inspect.getsource(inspect.getsource)
def getsource(object):
"""Return the text of the source code for an object.
The argument may be a module, class, method, function, traceback, frame,
or code object. The source code is returned as a single string. An
IOError is raised if the source code cannot be retrieved."""
lines, lnum = getsourcelines(object)
return string.join(lines, '')
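inspect.getargspec is also frequently useful if you're dealing with wrapping or manipulating functions, as it will give the names and default values of function parameters. (In Python 3, use inspect.signature or inspect.getfullargspec instead; getargspec was removed in Python 3.11.)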
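On current Python 3 versions the same information is available from inspect.signature. A minimal sketch, using a made-up read_pdb stub purely as something to inspect:

```python
import inspect

def read_pdb(f, as_protein=0, *, all_models=0):
    """Hypothetical function used only to have something to inspect."""

sig = inspect.signature(read_pdb)

names = list(sig.parameters)
defaults = {
    name: param.default
    for name, param in sig.parameters.items()
    if param.default is not inspect.Parameter.empty   # skip required params
}

print(sig)        # (f, as_protein=0, *, all_models=0)
print(names)      # ['f', 'as_protein', 'all_models']
print(defaults)   # {'as_protein': 0, 'all_models': 0}
```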
If you're interested in a GUI for this, take a look at objbrowser. It uses the inspect module from the Python standard library for the object introspection underneath.
You can list the attributes of an object with dir() in the shell:
>>> dir(object())
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
Of course, there is also the inspect module: http://docs.python.org/library/inspect.html#module-inspect
Try ppretty
from ppretty import ppretty

class A(object):
    s = 5

    def __init__(self):
        self._p = 8

    @property
    def foo(self):
        return range(10)

print ppretty(A(), indent='    ', depth=2, width=30, seq_length=6,
              show_protected=True, show_private=False, show_static=True,
              show_properties=True, show_address=True)
Output:
__main__.A at 0x1debd68L (
    _p = 8,
    foo = [0, 1, 2, ..., 7, 8, 9],
    s = 5
)
While pprint has been mentioned already by others I'd like to add some context.
The pprint module provides a capability to “pretty-print” arbitrary
Python data structures in a form which can be used as input to the
interpreter. If the formatted structures include objects which are not
fundamental Python types, the representation may not be loadable. This
may be the case if objects such as files, sockets, classes, or
instances are included, as well as many other built-in objects which
are not representable as Python constants.
pprint might be in high-demand by developers with a PHP background who are looking for an alternative to var_dump().
Objects with a __dict__ attribute can be dumped nicely using pprint() combined with vars(), which returns the __dict__ attribute for a module, class, instance, etc.:
from pprint import pprint
pprint(vars(your_object))
So, no need for a loop.
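For instance, with a small made-up Config class (the attribute values are arbitrary), vars() hands pprint the instance's own __dict__:

```python
from pprint import pformat, pprint

class Config:
    """Hypothetical object with a few nested attributes."""
    def __init__(self):
        self.name = "demo"
        self.paths = ["/tmp/a", "/tmp/b"]
        self.limits = {"cpu": 2, "memory_mb": 512}

cfg = Config()

pprint(vars(cfg), width=40)            # wraps long structures across lines

text = pformat(vars(cfg), width=40)    # same formatting, returned as a string
```

pformat is handy when the dump should go to a log file instead of standard output.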
To dump all variables contained in the global or local scope simply use:
pprint(globals())
pprint(locals())
locals() shows variables defined in a function.
It's also useful to access functions with their corresponding name as a string key, among other usages:
locals()['foo']() # foo()
globals()['foo']() # foo()
Similarly, you can use dir() to see the contents of a module or the attributes of an object.
And there is still more.
"""Visit http://diveintopython.net/"""
__author__ = "Mark Pilgrim (mark@diveintopython.org)"

def info(object, spacing=10, collapse=1):
    """Print methods and doc strings.

    Takes module, class, list, dictionary, or string."""
    methodList = [e for e in dir(object) if callable(getattr(object, e))]
    processFunc = collapse and (lambda s: " ".join(s.split())) or (lambda s: s)
    print("\n".join(["%s %s" %
                     (method.ljust(spacing),
                      processFunc(str(getattr(object, method).__doc__)))
                     for method in methodList]))

if __name__ == "__main__":
    print(info.__doc__)
Others have already mentioned the dir() built-in which sounds like what you're looking for, but here's another good tip. Many libraries -- including most of the standard library -- are distributed in source form. Meaning you can pretty easily read the source code directly. The trick is in finding it; for example:
>>> import string
>>> string.__file__
'/usr/lib/python2.5/string.pyc'
The *.pyc file is compiled, so remove the trailing 'c' and open up the uncompiled *.py file in your favorite editor or file viewer:
/usr/lib/python2.5/string.py
I've found this incredibly useful for discovering things like which exceptions are raised from a given API. This kind of detail is rarely well-documented in the Python world.
There is a very cool tool called objexplore. Here is a simple example on how to use its explore function on a pandas DataFrame.
import pandas as pd
df=pd.read_csv('https://storage.googleapis.com/download.tensorflow.org/data/heart.csv')
from objexplore import explore
explore(df)
This opens an interactive explorer for the DataFrame in your shell.
Two great tools for inspecting code are:
IPython. A python terminal that allows you to inspect using tab completion.
Eclipse with the PyDev plugin. It has an excellent debugger that allows you to break at a given spot and inspect objects by browsing all variables as a tree. You can even use the embedded terminal to try code at that spot or type the object and press '.' to have it give code hints for you.
If you want to look at parameters and methods, as others have pointed out you may well use pprint or dir()
If you want to see the actual value of the contents, you can do
object.__dict__
pprint and dir together work great
There is a Python library built just for this purpose: inspect, which has been part of the standard library since Python 2.1.
If you are interested in seeing the source code of the function corresponding to the object myobj, you can type in IPython or Jupyter Notebook:
myobj??
In Python 3.8, you can print out the contents of an object by using its __dict__ attribute. For example,
class Person():
    pass

person = Person()
## set attributes
person.first = 'Oyinda'
person.last = 'David'
## to see the content of the object
print(person.__dict__)
{'first': 'Oyinda', 'last': 'David'}
If you are looking for a slightly more delicate solution, you could try objprint. A positive side of it is that it can handle nested objects. For example:
from objprint import objprint

class Position:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Player:
    def __init__(self):
        self.name = "Alice"
        self.age = 18
        self.items = ["axe", "armor"]
        self.coins = {"gold": 1, "silver": 33, "bronze": 57}
        self.position = Position(3, 5)

objprint(Player())
Will print out
<Player
  .name = 'Alice',
  .age = 18,
  .items = ['axe', 'armor'],
  .coins = {'gold': 1, 'silver': 33, 'bronze': 57},
  .position = <Position
    .x = 3,
    .y = 5
  >
>
import pprint
pprint.pprint(obj.__dict__)
or
pprint.pprint(vars(obj))
If you want to look inside a live object, then Python's inspect module is a good answer. In general, it works for getting the source code of functions that are defined in a source file somewhere on disk. If you want to get the source of live functions and lambdas that were defined in the interpreter, you can use dill.source.getsource from dill. It can also get the code from bound or unbound class methods and functions defined in curries... however, you might not be able to compile that code without the enclosing object's code.
>>> from dill.source import getsource
>>>
>>> def add(x,y):
... return x+y
...
>>> squared = lambda x:x**2
>>>
>>> print getsource(add)
def add(x,y):
return x+y
>>> print getsource(squared)
squared = lambda x:x**2
>>>
>>> class Foo(object):
... def bar(self, x):
... return x*x+x
...
>>> f = Foo()
>>>
>>> print getsource(f.bar)
def bar(self, x):
return x*x+x
>>>
vars(obj) returns the attributes of an object.
In addition if you want to look inside list and dictionaries, you can use pprint()
Many good tips already, but the shortest and easiest (not necessarily the best) has yet to be mentioned. In IPython or Jupyter, type:
object?
Try using:
print(object.stringify())
where object is the variable name of the object you are trying to inspect.
This prints out a nicely formatted and tabbed output showing all the hierarchy of keys and values in the object.
NOTE: This works in python3. Not sure if it works in earlier versions
UPDATE: This doesn't work on all types of objects, because stringify() is not a standard method and exists only on classes that define it. If you encounter one of those types (like a Request object), use one of the following instead:
dir(object())
or
import pprint
then:
pprint.pprint(object.__dict__)