Python: Convert string into function name; getattr or equal? - python

I am editing PROSS.py to work with .cif files for protein structures. Inside the existing PROSS.py there are the following functions (I believe that's the correct term if they are not associated with any class?), defined at module level in the .py file:
...
def unpack_pdb_line(line, ATOF=_atof, ATOI=_atoi, STRIP=string.strip):
...
...
def read_pdb(f, as_protein=0, as_rna=0, as_dna=0, all_models=0,
unpack=unpack_pdb_line, atom_build=atom_build):
I am adding an options parser for command line arguments, and one of the options is to specify an alternate method to use besides unpack_pdb_line. So the pertinent part of the options parser is:
...
parser.add_option("--un", dest="unpack_method", default="unpack_pdb_line", type="string", help="Unpack method to use. Default is unpack_pdb_line.")
...
unpack=options.unpack_method
However, options.unpack_method is a string and I need to use the function with the same name as the string inside options.unpack_method. How do I use getattr etc to convert the string into the actual function name?
Thanks,
Paul

Usually you just use a dict and store (func_name, function) pairs:
unpack_options = { 'unpack_pdb_line' : unpack_pdb_line,
'some_other' : some_other_function }
unpack_function = unpack_options[options.unpack_method]
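A minimal sketch of that dict approach wired to a command-line parser. It uses argparse instead of the optparse module from the question, unpack_cif_line is a hypothetical second unpacker invented for illustration, and both function bodies are placeholders.

```python
import argparse

def unpack_pdb_line(line):
    # Placeholder body; the real function lives in PROSS.py.
    return ("pdb", line.strip())

def unpack_cif_line(line):
    # Hypothetical alternative unpacker, invented for this example.
    return ("cif", line.strip())

# Whitelist mapping option strings to the functions they select.
UNPACK_OPTIONS = {
    "unpack_pdb_line": unpack_pdb_line,
    "unpack_cif_line": unpack_cif_line,
}

def parse_unpack_method(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--un",
        dest="unpack_method",
        default="unpack_pdb_line",
        choices=sorted(UNPACK_OPTIONS),  # unknown names are rejected by the parser
        help="Unpack method to use. Default is unpack_pdb_line.",
    )
    options = parser.parse_args(argv)
    return UNPACK_OPTIONS[options.unpack_method]

unpack = parse_unpack_method(["--un", "unpack_cif_line"])
print(unpack(" ATOM \n"))
```

Passing the dict keys as choices lets the parser produce the error message for an unknown name, so the lookup itself cannot raise KeyError.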

If you want to exploit the dictionaries (&c) that Python's already keeping on your behalf, I'd suggest:
import sys

def str2fun(astr):
    module, _, function = astr.rpartition('.')
    if module:
        __import__(module)
        mod = sys.modules[module]
    else:
        mod = sys.modules['__main__']  # or whatever's the "default module"
    return getattr(mod, function)
You'll probably want to check the function's signature (and catch exceptions to provide nicer error messages) e.g. via inspect, but this is a useful general-purpose function.
It's easy to add a dictionary of shortcuts, as a fallback, if some known functions' full string names (including module/package qualifications) are unwieldy to express this way.
Note we don't use __import__'s result (it doesn't work right when the function is in a module inside some package, as __import__ returns the top-level name of the package... just accessing sys.modules after the import is more practical).
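The signature check and friendlier error messages mentioned above could look roughly like this. It is a sketch under assumptions: load_callable and its required_params argument are names made up here, it uses importlib.import_module in place of __import__ plus sys.modules, and it insists on a dotted name instead of defaulting to __main__.

```python
import importlib
import inspect

def load_callable(dotted_name, required_params=1):
    """Resolve 'package.module.func' and check it accepts enough positional arguments."""
    module_name, _, attr = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.function', got %r" % dotted_name)
    try:
        module = importlib.import_module(module_name)
        func = getattr(module, attr)
    except (ImportError, AttributeError) as exc:
        raise ValueError("cannot resolve %r: %s" % (dotted_name, exc))
    if not callable(func):
        raise ValueError("%r is not callable" % dotted_name)
    try:
        signature = inspect.signature(func)
    except (TypeError, ValueError):
        return func  # some builtins expose no signature; nothing to check
    try:
        # bind() raises TypeError if the call shape would not be accepted.
        signature.bind(*[None] * required_params)
    except TypeError as exc:
        raise ValueError("%r has an unsuitable signature: %s" % (dotted_name, exc))
    return func

join = load_callable("os.path.join")
```

Every failure surfaces as a ValueError with a readable message, which a command-line front end can report without a traceback.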

vars()["unpack_pdb_line"]() will work too, as long as the function is defined in the current scope.
globals() and locals() work in a similar way:
>>> def a():return 1
>>>
>>> vars()["a"]
<function a at 0x009D1230>
>>>
>>> vars()["a"]()
1
>>> locals()["a"]()
1
>>> globals()["a"]()
1
Cheers,

If you are taking input from a user, for the sake of security it is probably best to
use a hand-made dict which will accept only a well-defined set of admissible user inputs:
unpack_options = { 'unpack_pdb_line' : unpack_pdb_line,
'unpack_pdb_line2' : unpack_pdb_line2,
}
Ignoring security for a moment, let us note in passing that an easy way to
go from (strings of variable names) to (the value referenced by the variable name)
is to use the globals() builtin dict:
unpack_function=globals()['unpack_pdb_line']
Of course, that will only work if the variable unpack_pdb_line is in the global namespace.
If you need to reach into a package for a module, or a module for a variable, then
you could use this function:
import sys
def str_to_obj(astr):
    print('processing %s'%astr)
    try:
        return globals()[astr]
    except KeyError:
        try:
            __import__(astr)
            mod=sys.modules[astr]
            return mod
        except ImportError:
            module,_,basename=astr.rpartition('.')
            if module:
                mod=str_to_obj(module)
                return getattr(mod,basename)
            else:
                raise
You could use it like this:
str_to_obj('scipy.stats')
# <module 'scipy.stats' from '/usr/lib/python2.6/dist-packages/scipy/stats/__init__.pyc'>
str_to_obj('scipy.stats.stats')
# <module 'scipy.stats.stats' from '/usr/lib/python2.6/dist-packages/scipy/stats/stats.pyc'>
str_to_obj('scipy.stats.stats.chisquare')
# <function chisquare at 0xa806844>
It works for nested packages, modules, functions, or (global) variables.

function = eval_dottedname(name if '.' in name else "%s.%s" % (__name__, name))
Where eval_dottedname():
from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2

def eval_dottedname(dottedname):
    """
    >>> eval_dottedname("os.path.join") #doctest: +ELLIPSIS
    <function join at 0x...>
    >>> eval_dottedname("sys.exit") #doctest: +ELLIPSIS
    <built-in function exit>
    >>> eval_dottedname("sys") #doctest: +ELLIPSIS
    <module 'sys' (built-in)>
    """
    return reduce(getattr, dottedname.split(".")[1:],
                  __import__(dottedname.partition(".")[0]))
eval_dottedname() is the only one among all the answers that supports arbitrary names with multiple dots in them, e.g. 'datetime.datetime.now'. It doesn't work for nested modules that require an explicit import, though I can't recall an example of such a module in the stdlib.
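One way to cover nested modules that need their own import is to import the longest importable prefix and walk the remainder with getattr. This is a sketch, not part of the answer above: resolve_dotted is a made-up name, and xml.etree.ElementTree is used only as an example of a submodule that is not necessarily an attribute of its parent package until it has been imported.

```python
import importlib

def resolve_dotted(dotted_name):
    """Import the longest importable prefix, then walk the rest with getattr."""
    parts = dotted_name.split(".")
    for cut in range(len(parts), 0, -1):
        module_name = ".".join(parts[:cut])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            continue  # not a module; try a shorter prefix
        for attr in parts[cut:]:
            obj = getattr(obj, attr)
        return obj
    raise ImportError("no importable prefix in %r" % dotted_name)

fromstring = resolve_dotted("xml.etree.ElementTree.fromstring")
print(fromstring("<a/>").tag)
```

A caveat of this design: an ImportError raised inside a module while it is being imported is treated the same as "not a module", so a broken module shows up as an AttributeError or ImportError on a shorter prefix.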

Related

From within a decorator, how do I get the name of the PARENT MODULE NAME of an arbitrary function I'm wrapping (Python) [duplicate]

In a Python program, if a name exists in the namespace of the program, is it possible to find out if the name is imported from some module, and if yes, which module it is imported from?
You can see which module a function has been defined in via the __module__ attribute. From the Python Data model documentation on __module__:
The name of the module the function was defined in, or None if unavailable.
Example:
>>> from re import compile
>>> compile.__module__
're'
>>> def foo():
... pass
...
>>> foo.__module__
'__main__'
>>>
The Data model later mentions that classes have the same attribute as well:
__module__ is the module name in which the class was defined.
>>> from datetime import datetime
>>> datetime.__module__
'datetime'
>>> class Foo:
... pass
...
>>> Foo.__module__
'__main__'
>>>
You can also do this with builtin names such as int and list. You can access them from the builtins module.
>>> int.__module__
'builtins'
>>> list.__module__
'builtins'
>>>
I can use int and list without from builtins import int, list. So how do int and list become available in my program?
That is because int and list are builtin names. You don't have to explicitly import them for Python to be able to find them in the current namespace. You can see this for yourself in the CPython virtual machine source code. As @user2357112 mentioned, builtin names are accessed when global lookup fails. Here's the relevant snippet:
if (v == NULL) {
    v = PyDict_GetItem(f->f_globals, name);
    Py_XINCREF(v);
    if (v == NULL) {
        if (PyDict_CheckExact(f->f_builtins)) {
            v = PyDict_GetItem(f->f_builtins, name);
            if (v == NULL) {
                format_exc_check_arg(
                    PyExc_NameError,
                    NAME_ERROR_MSG, name);
                goto error;
            }
            Py_INCREF(v);
        }
        else {
            v = PyObject_GetItem(f->f_builtins, name);
            if (v == NULL) {
                if (PyErr_ExceptionMatches(PyExc_KeyError))
                    format_exc_check_arg(
                        PyExc_NameError,
                        NAME_ERROR_MSG, name);
                goto error;
            }
        }
    }
}
In the code above, CPython first searches for a name in the global scope. If that fails, it falls back and attempts to get the name from a mapping of builtin names in the frame object it's currently executing. That's what f->f_builtins is.
You can observe this mapping from the Python level using sys._getframe():
>>> import sys
>>> frame = sys._getframe()
>>>
>>> frame.f_builtins['int']
<class 'int'>
>>> frame.f_builtins['list']
<class 'list'>
>>>
sys._getframe() returns the frame at the top of the call stack. In this case, it's the frame for the module scope. And as you can see from above, the f_builtins mapping for the frame contains both the int and list classes, so Python has automatically made those names available to you. In other words, it has "built" them into the scope; hence the term "builtins".
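The lookup order described above (globals first, then builtins) can be mimicked at the Python level. This is only an illustration of the order for module-level names: lookup_scope is a made-up helper, it takes the global namespace as an explicit dict, and it ignores local and enclosing function scopes.

```python
import builtins

def lookup_scope(name, global_namespace):
    """Report where a bare name would be found: globals first, then builtins."""
    if name in global_namespace:
        return "global"
    if hasattr(builtins, name):
        return "builtin"
    return None  # the interpreter would raise NameError here

# A stand-in for a module's globals; "list" deliberately shadows the builtin.
namespace = {"x": 10, "list": "shadowed"}
print(lookup_scope("x", namespace))     # global
print(lookup_scope("int", namespace))   # builtin
print(lookup_scope("list", namespace))  # global
print(lookup_scope("nope", namespace))  # None
```

The third call shows why shadowing works: a global named list is found before the builtins mapping is ever consulted.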
If for some reason the source is unavailable, you could use getmodule from inspect which tries its best to find the module by grabbing __module__ if it exists and then falling back to other alternatives.
If everything goes o.k, you get back a module object. From that, you can grab the __name__ to get the actual name of the module:
from inspect import getmodule
from collections import OrderedDict
from math import sin
getmodule(getmodule).__name__    # 'inspect'
getmodule(OrderedDict).__name__  # 'collections'
getmodule(sin).__name__          # 'math'
If it doesn't find anything, it returns None so you'd have to special case this. In general this encapsulates the logic for you so you don't need to write a function yourself to actually grab __module__ if it exists.
This doesn't work for objects that don't have this information attached. You can, as a fall-back, try and pass in the type to circumvent it:
o = OrderedDict()
getmodule(o) # None
getmodule(type(o)).__name__ # 'collections'
but that won't always yield the correct result:
from math import pi
getmodule(type(pi)).__name__ # 'builtins'
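The None special case and the type fallback described above can be folded into one small helper. This is a sketch; module_name_of is a name invented here, and as noted the fallback reports where the type was defined, which is not necessarily where the object itself came from.

```python
from collections import OrderedDict
from inspect import getmodule
from math import pi, sin

def module_name_of(obj):
    """Return the defining module's name, falling back to the object's type."""
    module = getmodule(obj)
    if module is None:
        # Instances often carry no module information; try their class instead.
        module = getmodule(type(obj))
    return module.__name__ if module is not None else None

print(module_name_of(sin))            # 'math'
print(module_name_of(OrderedDict()))  # 'collections'
print(module_name_of(pi))             # 'builtins' (the module of float, not math)
```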
Some objects (but far from all) have an attribute __module__.
Unless code is doing something unusual like updating globals directly, the source code should indicate where every variable came from:
x = 10 # Assigned in the current module
from random import randrange # Loaded from random
from functools import * # Loads everything listed in functools.__all__
You can also look at globals(): it returns a dict containing the names Python defines itself as well as the variables, modules and functions you declared in the namespace.
>>> x = 10
>>> import os
>>> globals() # too big to display here, but it ends with
# {... 'x': 10, 'os': <module 'os' from '/usr/local/lib/python2.7/os.py'>}
Therefore, you can test whether a variable was declared like this:
if 'x' in globals():
    print('x exists')
try:
    print(globals()['y'])
except KeyError:
    print('but y does not')
# x exists
# but y does not
This also works with modules:
print(globals()['os']) # <module 'os' from '/usr/local/lib/python2.7/os.py'>
try:
    print(globals()['math'])
except KeyError:
    print('but math is not imported')
# <module 'os' from '/usr/local/lib/python2.7/os.py'>
# but math is not imported

How to set a variable by its name in Python?

EDIT 2: since so many people are objecting to the bad design this use case can reveal, readers of this question and its answers should think twice before using it.
I'm trying to set a variable (not a property) by its name in Python:
foo = 'bar'
thefunctionimlookingfor('foo', 'baz')
print foo #should print baz
PS: a function to access a variable by its name (without eval) would be a plus!
EDIT: I do know dictionaries exist and that this kind of usage is discouraged; I've chosen to use it for a very specific purpose (config file modification according to environment) that will make my code easier to read.
When you want variably-named variables, it's time to use a dictionary:
data = {}
foo = 'bar'
data[foo] = 'baz'
print data['bar']
Dynamically setting variables in the local scope is not possible in Python 2.x without using exec, and not possible at all in Python 3.x. You can change the global scope by modifying the dictionary returned by globals(), but you actually shouldn't. Simply use your own dictionary instead.
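For the config-by-environment use case mentioned in the question, "your own dictionary" might look like the following. This is a hedged sketch: the setting names, environment names and helper functions are all invented for illustration.

```python
# Base settings plus per-environment overrides; all names here are made up.
BASE_SETTINGS = {"debug": False, "db_host": "localhost"}
OVERRIDES = {
    "development": {"debug": True},
    "production": {"db_host": "db.internal.example"},
}

def settings_for(environment):
    settings = dict(BASE_SETTINGS)  # copy so the base stays untouched
    settings.update(OVERRIDES.get(environment, {}))
    return settings

def set_setting(settings, name, value):
    """Set a setting by its string name, rejecting names that do not exist."""
    if name not in settings:
        raise KeyError("unknown setting: %r" % name)
    settings[name] = value

settings = settings_for("development")
set_setting(settings, "db_host", "127.0.0.1")
print(settings)
```

Unlike writing into globals(), a misspelled name fails loudly here instead of silently creating a new variable.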
You can do something like:
def thefunctionimlookingfor(a, b):
    globals()[a] = b
Usage:
>>> foo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'foo' is not defined
>>> thefunctionimlookingfor('foo', 'bar')
>>> foo
'bar'
But this is a terrible idea, as others have mentioned. Namespaces are a useful concept. Consider a redesign.
At the module level you can use setattr on the current module, which you can get from sys.modules:
setattr(sys.modules[__name__], 'name', 'value')
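The locals() function returns a dictionary filled with the local variables. At module level this is the same dictionary as globals(), so assigning into it works there; inside a function, changes to it are not guaranteed to affect the actual local variables.
locals()['foo'] = 'baz'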
Are you looking for functions like these? They allow modifying the local namespace you happen to be in.
import sys
def get_var(name):
    return sys._getframe(1).f_locals[name]
def set_var(name, value):
    sys._getframe(1).f_locals[name] = value
def del_var(name):
    del sys._getframe(1).f_locals[name]

How to use the __import__ function to import a name from a submodule?

I'm trying to replicate from foo.bar import object using the __import__ function and I seem to have hit a wall.
A simpler case from glob import glob is easy: glob = __import__("glob").glob
The problem I'm having is that I am importing a name from a subpackage (i.e. from foo.bar):
So what I'd like is something like
string_to_import = "bar"
object = __import__("foo." + string_to_import).object
But this just imported the top-level foo package, not the foo.bar subpackage:
__import__("foo.bar")
<module 'foo' from 'foo/__init__.pyc'>
How to use python's __import__() function properly?
There are two kinds of uses:
direct importing
a hook to alter import behavior
For the most part, you don't really need to do either.
For user-space importing
Best practice is to use importlib instead. But if you insist:
Trivial usage:
>>> sys = __import__('sys')
>>> sys
<module 'sys' (built-in)>
Complicated:
>>> os = __import__('os.path')
>>> os
<module 'os' from '/home/myuser/anaconda3/lib/python3.6/os.py'>
>>> os.path
<module 'posixpath' from '/home/myuser/anaconda3/lib/python3.6/posixpath.py'>
If you want the rightmost child module in the name, pass a nonempty list, e.g. [None], to fromlist:
>>> path = __import__('os.path', fromlist=[None])
>>> path
<module 'posixpath' from '/home/myuser/anaconda3/lib/python3.6/posixpath.py'>
Or, as the documentation declares, use importlib.import_module:
>>> importlib = __import__('importlib')
>>> futures = importlib.import_module('concurrent.futures')
>>> futures
<module 'concurrent.futures' from '/home/myuser/anaconda3/lib/python3.6/concurrent/futures/__init__.py'>
Documentation
The docs for __import__ are the most confusing of the builtin functions.
__import__(...)
__import__(name, globals=None, locals=None, fromlist=(), level=0) -> module
Import a module. Because this function is meant for use by the Python
interpreter and not for general use it is better to use
importlib.import_module() to programmatically import a module.
The globals argument is only used to determine the context;
they are not modified. The locals argument is unused. The fromlist
should be a list of names to emulate ``from name import ...'', or an
empty list to emulate ``import name''.
When importing a module from a package, note that __import__('A.B', ...)
returns package A when fromlist is empty, but its submodule B when
fromlist is not empty. Level is used to determine whether to perform
absolute or relative imports. 0 is absolute while a positive number
is the number of parent directories to search relative to the current module.
If you read it carefully, you get the sense that the API was originally intended to allow for lazy-loading of functions from modules. However, this is not how CPython works, and I am unaware if any other implementations of Python have managed to do this.
Instead, CPython executes all of the code in the module's namespace on its first import, after which the module is cached in sys.modules.
__import__ can still be useful. But understanding what it does based on the documentation is rather hard.
Full Usage of __import__
To adapt the full functionality to demonstrate the current __import__ API, here is a wrapper function with a cleaner, better documented, API.
def importer(name, root_package=False, relative_globals=None, level=0):
    """ We only import modules, functions can be looked up on the module.
    Usage:
    from foo.bar import baz
    >>> baz = importer('foo.bar.baz')
    import foo.bar.baz
    >>> foo = importer('foo.bar.baz', root_package=True)
    >>> foo.bar.baz
    from .. import baz (level = number of dots)
    >>> baz = importer('baz', relative_globals=globals(), level=2)
    """
    return __import__(name, locals=None, # locals has no use
                      globals=relative_globals,
                      fromlist=[] if root_package else [None],
                      level=level)
To demonstrate, e.g. from a sister package to baz:
baz = importer('foo.bar.baz')
foo = importer('foo.bar.baz', root_package=True)
baz2 = importer('bar.baz', relative_globals=globals(), level=2)
assert foo.bar.baz is baz is baz2
Dynamic access of names in the module
To dynamically access globals by name from the baz module, use getattr. For example:
for name in dir(baz):
print(getattr(baz, name))
Hook to alter import behavior
You can use __import__ to alter or intercept importing behavior. In this case, let's just print the arguments it gets to demonstrate we're intercepting it:
old_import = __import__
def noisy_importer(name, globals=None, locals=None, fromlist=(), level=0):
    # Same parameter order and defaults as the builtin __import__.
    print(f'name: {name!r}')
    print(f'fromlist: {fromlist}')
    print(f'level: {level}')
    return old_import(name, globals, locals, fromlist, level)
import builtins
builtins.__import__ = noisy_importer
And now when you import you can see these important arguments.
>>> from os.path import join as opj
name: 'os.path'
fromlist: ('join',)
level: 0
>>> opj
<function join at 0x7fd08d882618>
Perhaps in this context getting the globals or locals could be useful, but no specific uses for this immediately come to mind.
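Because replacing builtins.__import__ affects every later import in the process, it can help to install such a hook only temporarily. The following is a sketch of that idea; record_imports is a name invented here, and it collects the requested names in a list where the hook above prints them.

```python
import builtins
from contextlib import contextmanager

@contextmanager
def record_imports():
    """Temporarily wrap builtins.__import__ and collect requested module names."""
    seen = []
    original_import = builtins.__import__

    def recording_import(name, globals=None, locals=None, fromlist=(), level=0):
        seen.append(name)
        return original_import(name, globals, locals, fromlist, level)

    builtins.__import__ = recording_import
    try:
        yield seen
    finally:
        builtins.__import__ = original_import  # always restore the real hook

with record_imports() as seen:
    from os.path import join
print(seen)
```

The list can contain more than the names written in the with block, since imports executed inside a module that is being loaded for the first time go through the same hook.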
The __import__ function will return the top level module of a package, unless you pass a nonempty fromlist argument:
_temp = __import__('foo.bar', fromlist=['object'])
object = _temp.object
See the Python docs on the __import__ function.
You should use importlib.import_module, __import__ is not advised outside the interpreter.
In __import__'s docstring:
Import a module. Because this function is meant for use by the Python
interpreter and not for general use it is better to use
importlib.import_module() to programmatically import a module.
It also supports relative imports.
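Applied to the original question, from foo.bar import object with both parts given as strings could be sketched like this. import_name is a made-up helper, and os.path stands in for the hypothetical foo.bar package.

```python
import importlib

def import_name(module_path, name):
    """Rough equivalent of `from module_path import name`, with both given as strings."""
    module = importlib.import_module(module_path)  # returns the submodule itself
    return getattr(module, name)

join = import_name("os.path", "join")
print(join("a", "b"))
```

One difference from the from statement: if name is a submodule that has not been imported yet, the statement would import it, while this getattr raises AttributeError.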
Rather than use the __import__ function I would use the getattr function:
model = getattr(module, model_s)
where module is the module to look in and model_s is your model string. The __import__ function is not meant to be used loosely, whereas getattr will get you what you want.
In addition to these excellent answers, I use __import__ for convenience, to call a one-liner on the fly. Examples like the following can also be saved as auto-triggered snippets in your IDE.
Plant an ipdb break-point (triggered with "ipdb")
__import__("ipdb").set_trace(context=9)
Print prettily (triggered with "pp")
__import__("pprint").pprint(<cursor-position>)
This way, you get a temporary object, that is not referenced by anything, and call an attribute on the spot. Also, you can easily comment, uncomment or delete a single line.

Importing a python module into a dict (for use as globals in execfile())?

I'm using the Python execfile() function as a simple-but-flexible way of handling configuration files -- basically, the idea is:
# Evaluate the 'filename' file into the dictionary 'foo'.
foo = {}
execfile(filename, foo)
# Process all 'Bar' items in the dictionary.
for item in foo.values():
    if isinstance(item, Bar):
        pass  # process item
This requires that my configuration file has access to the definition of the Bar class. In this simple example, that's trivial; we can just define foo = {'Bar' : Bar} rather than an empty dict. However, in the real example, I have an entire module I want to load. One obvious syntax for that is:
foo = {}
exec('from BarModule import *', foo)
execfile(filename, foo)
However, I've already imported BarModule in my top-level file, so it seems like I should be able to just directly define foo as the set of things defined by BarModule, without having to go through this chain of exec and import.
Is there a simple idiomatic way to do that?
Maybe you can use the __dict__ defined by the module.
>>> import os
>>> str = 'getcwd()'
>>> eval(str,os.__dict__)
Use the builtin vars() function to get the attributes of an object (such as a module) as a dict.
The typical solution is to use getattr:
>>> s = 'getcwd'
>>> getattr(os, s)()
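Putting the vars() suggestion to work for the configuration use case might look like this. It is a sketch under assumptions: math stands in for BarModule, the config text is an inline string standing in for a file, and exec(compile(...)) replaces execfile, which no longer exists in Python 3.

```python
import math

# Stand-in for the contents of a configuration file.
config_source = "radius = 2\narea = pi * radius ** 2\n"

# Copy the module's namespace so the config cannot modify the module itself.
namespace = dict(vars(math))
exec(compile(config_source, "<config>", "exec"), namespace)

print(namespace["radius"], namespace["area"])
```

Note that vars(module) exposes every attribute of the module, including underscore-prefixed ones, whereas from module import * honours __all__ and skips private names.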

Does Python have a method that returns all the attributes in a module?

I already searched for it on Google but didn't have any luck.
In addition to the dir builtin that has been mentioned, there is the inspect module, which has a really nice getmembers function. Combined with pprint.pprint you have a powerful combo:
from pprint import pprint
from inspect import getmembers
import linecache
pprint(getmembers(linecache))
some sample output:
('__file__', '/usr/lib/python2.6/linecache.pyc'),
('__name__', 'linecache'),
('__package__', None),
('cache', {}),
('checkcache', <function checkcache at 0xb77a7294>),
('clearcache', <function clearcache at 0xb77a7224>),
('getline', <function getline at 0xb77a71ec>),
('getlines', <function getlines at 0xb77a725c>),
('os', <module 'os' from '/usr/lib/python2.6/os.pyc'>),
('sys', <module 'sys' (built-in)>),
('updatecache', <function updatecache at 0xb77a72cc>)
Note that unlike dir you get to see the actual values of the members. You can apply filters to getmembers that are similar to the ones that you can apply to dir; they can just be more powerful. For example,
def get_with_attribute(mod, attribute, public=True):
    items = getmembers(mod)
    if public:
        # keep only names that do not start with an underscore
        items = filter(lambda item: not item[0].startswith('_'), items)
    return [attr for attr, value in items if hasattr(value, attribute)]
import module
dir(module)
You're looking for dir:
import os
dir(os)
??dir
dir([object]) -> list of strings
If called without an argument, return the names in the current scope.
Else, return an alphabetized list of names comprising (some of) the attributes
of the given object, and of attributes reachable from it.
If the object supplies a method named __dir__, it will be used; otherwise
the default dir() logic is used and returns:
for a module object: the module's attributes.
for a class object: its attributes, and recursively the attributes
of its bases.
for any other object: its attributes, its class's attributes, and
recursively the attributes of its class's base classes.
As has been correctly pointed out, the dir function returns a list of all the names available in a given object.
If you call dir() with no argument at the interactive prompt, it responds with the names available at startup. If you call:
import module
print dir(module)
it will print a list with all the available names in the module named module. Most of the time you are interested only in public names (those that you are supposed to be using). By convention, private methods and variables in Python start with an underscore, so what I do is the following:
import module
for method in dir(module):
    if not method.startswith('_'):
        print method
That way you only print public names (though bear in mind that the _ prefix is only a convention, and some module authors do not follow it).
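Some modules declare their public interface explicitly in __all__, which can be checked before falling back to the underscore convention. A sketch, with public_names as an invented helper and a throwaway module object built just for the demonstration:

```python
import types

def public_names(module):
    """Names a module exports: __all__ if defined, else names without a leading underscore."""
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return sorted(declared)
    return sorted(name for name in dir(module) if not name.startswith("_"))

# Throwaway module object built just for this demonstration.
demo = types.ModuleType("demo")
demo.visible = 1
demo._hidden = 2
print(public_names(demo))  # ['visible']

demo.__all__ = ["_hidden"]
print(public_names(demo))  # ['_hidden']
```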
dir is what you need :)
