Import modules that don't exist (yet) - python

I wish to create my own variation of amoffat's sh module, where it can import pretty much any command from the user's UNIX path, such as:
from sh import hg
However, I am having a hard time finding a way to intercept / override python's own import [...] and from [...] import [...]. At this point I simply need a way to at least get [the name of] the object of the from import, at which point I can simply setattr() and partial() my way from there, I hope. I'm at a complete loss of how to do this at the moment, however, and hence, have no code to show for it.
The gist of what I'm going for:
from test import t # Even though "t" doesn't exist in the module (yet)
Any help with the full code would be greatly appreciated!
Final Answer, consolidated:
def __getattr__(name):
    if name == '__path__': raise AttributeError
    print(name)

There is actually a straightforward way if you are on Python 3.7+, PEP-562, which allows you to define __getattr__ at the module level:
def __getattr__(name):
    if name == "t":
        return "magic"
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
There is also a function __dir__ that you can define to declare what the builtin dir() will say about names in your module.
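A minimal sketch of how the two hooks fit together. To keep it runnable as a single file, the module body is executed into a throwaway module object called demo (a hypothetical name, as is _LAZY_NAMES); in a real project the two functions would simply sit at the top level of your module file.

```python
import types

# Source of a hypothetical module; normally this would be the content of demo.py.
MODULE_SOURCE = '''
_LAZY_NAMES = ["t"]

def __getattr__(name):
    if name in _LAZY_NAMES:
        return "magic"
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

def __dir__():
    # Advertise the dynamic names alongside the real module globals.
    return sorted(list(globals()) + _LAZY_NAMES)
'''

demo = types.ModuleType("demo")
exec(MODULE_SOURCE, demo.__dict__)

print(demo.t)            # served by the module-level __getattr__
print("t" in dir(demo))  # dir() consults the module-level __dir__
```

Without __dir__, the name t would still be importable but would not show up in dir() or in most autocompletion tools.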
What sh does is more sophisticated, as they want to support versions below 3.7: Modifying sys.modules and replacing the module with a special object that pretends to be a module.

As @L3viathan pointed out, this is easy starting with Python 3.7: just define a __getattr__ function in your special module. So, for example, you could create an "echo" module (just returns the name of the object you requested) like this:
echo.py (Python >=3.7)
def __getattr__(name):
    return name
Then you could use it like this:
from echo import x
print(repr(x))
# 'x'
On earlier versions of Python, you have to subclass the module, as hinted in PEP-562. This also works in Python 3.7.
echo.py (Python >=2)
import sys, types
class EchoModule(types.ModuleType):
    def __getattr__(self, name):
        return name
sys.modules[__name__] = EchoModule(__name__)
You would use this the same way as the 3.7 version: from echo import something.
Update
For some reason Python tries to retrieve the attribute twice for each from echo import <x> call. It also calls __getattr__('__path__') when the module is loaded. You can avoid side effects in these cases with the following code:
echo.py (only define attributes once)
import sys, types
class EchoModule(types.ModuleType):
    def __getattr__(self, name):
        # don't define __path__ attribute
        if name == '__path__':
            raise AttributeError
        print("importing {}".format(name))
        # create the attribute in case it's required again
        setattr(self, name, name)
        # return the new attribute
        return getattr(self, name)
sys.modules[__name__] = EchoModule(__name__)
This code creates an attribute in the echo module each time a previously unused attribute is imported (sort of like collections.defaultdict). Then, if Python tries to import that same attribute again later, it will pull it directly from the module instead of calling __getattr__ (this is normal behavior for object attributes).
There is also some code here to avoid setting a spurious __path__ attribute; this also avoids running your code when __path__ is requested. Note that this may actually be the most important part; when I tested, just raising AttributeError for __path__ was enough to prevent the double-access to the named attribute.
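To tie this back to the setattr() and partial() idea in the question, here is a rough sketch that combines the module subclass with functools.partial and subprocess. It is not how sh itself is implemented; the names cmdproxy, CommandModule and run_command are made up for the illustration, and the module is registered in sys.modules directly so the example fits in one file. It assumes Python 3.7+ (for capture_output).

```python
import functools
import shutil
import subprocess
import sys
import types


def run_command(executable, *args):
    """Run the executable with the given arguments and return its stdout."""
    result = subprocess.run(
        [executable, *args], capture_output=True, text=True, check=True
    )
    return result.stdout


class CommandModule(types.ModuleType):
    """Hypothetical module type that resolves unknown names to PATH commands."""

    def __getattr__(self, name):
        # Let import-machinery probes such as __path__ fail normally.
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)
        executable = shutil.which(name)
        if executable is None:
            raise AttributeError(f"no command named {name!r} on PATH")
        command = functools.partial(run_command, executable)
        # Cache the callable so __getattr__ is not hit again for this name.
        setattr(self, name, command)
        return command


# Register the fake module under a hypothetical name.
sys.modules["cmdproxy"] = CommandModule("cmdproxy")

if shutil.which("echo"):
    from cmdproxy import echo
    print(echo("hello"))
```

A name that is not on the PATH raises AttributeError inside __getattr__, which the from ... import statement reports as an ImportError.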

Related

Call Python function using dynamic string variables

I am trying to create a dynamic method executor, where I have a list that will always contain two elements. The first element is the name of the file, the second element is the name of the method to execute.
How can I achieve this?
My code below unfortunately doesn't work, but it will give you a good indication of what I am trying to achieve.
from logic.intents import CenterCapacity
def method_executor(event):
    call_reference = ['CenterCapacity', 'get_capacity']
    # process method call
    return call_reference[0].call_reference[1]
Thanks!
You can use __import__ to look up the module by name and then use getattr to find the method. For example, if the following code is in a file called exec.py then
def dummy(): print("dummy")
def lookup(mod, func):
    module = __import__(mod)
    return getattr(module, func)
if __name__ == "__main__":
    lookup("exec","dummy")()
will output
dummy
Addendum
Alternatively importlib.import_module can be used, which although a bit more verbose, may be easier to use.
The most important difference between these two functions is that import_module() returns the specified package or module (e.g. pkg.mod), while __import__() returns the top-level package or module (e.g. pkg).
def lookup(mod, func):
    import importlib
    module = importlib.import_module(mod)
    return getattr(module, func)
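The difference quoted above only shows up with dotted names. A small demonstration, using the standard library's json.decoder purely as an example module:

```python
import importlib

top = __import__("json.decoder")               # returns the top-level package
sub = importlib.import_module("json.decoder")  # returns the submodule itself

print(top.__name__)  # json
print(sub.__name__)  # json.decoder

# With __import__, the submodule still has to be reached by attribute access.
print(getattr(top, "decoder") is sub)  # True
```

For a module without dots in its name, such as exec in the example above, both functions return the same object.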
Starting from:
from logic.intents import CenterCapacity
def method_executor(event):
    call_reference = ['CenterCapacity', 'get_capacity']
    # process method call
    return call_reference[0].call_reference[1]
Option 1
We have several options. The first is to use a class reference together with getattr. For this we have to remove the quotes around the class name and instantiate the class before looking up the method (you do not have to instantiate the class when the method is a staticmethod).
def method_executor(event):
    call_reference = [CenterCapacity, 'get_capacity'] # We now store a class reference
    # process method call
    return getattr(call_reference[0](), call_reference[1])
Option 2
A second option is based on this answer. It revolves around using getattr twice: we first get the current module using sys.modules[__name__] and then get the class from it using getattr.
import sys
def method_executor(event):
    call_reference = ['CenterCapacity', 'get_capacity']
    class_ref = getattr(sys.modules[__name__], call_reference[0])
    return getattr(class_ref, call_reference[1])
Option 3
A third option could be based on a full import path and use __import__('module.class'), take a look at this SO post.
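A sketch of what the full-path variant could look like. Only the module part of the path can be imported; the class is then fetched with getattr. The helper name resolve is made up, importlib.import_module is used in place of __import__, and collections.Counter stands in for the asker's CenterCapacity class so that the example runs on its own.

```python
import importlib


def resolve(dotted_path):
    """Return the object named by 'package.module.attribute'."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def method_executor(call_reference):
    # call_reference = [dotted path to a class, name of the method to call]
    cls = resolve(call_reference[0])
    method = getattr(cls(), call_reference[1])
    return method()


print(method_executor(["collections.Counter", "most_common"]))  # []
```

The sketch does not handle a path without any dot; rpartition would then yield an empty module name and import_module would fail.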
(Note: This answer assumes that the necessary imports have already happened, and you just need a mechanism to invoke the functions of the imported modules. If you also want the import to be done by some program code, I will have to add that part, using the importlib library.)
You can do this:
globals()[call_reference[0]].__dict__[call_reference[1]]()
Explanation:
globals() returns a mapping between global variable names and their referenced objects. The imported module's name counts as one of these global variables of the current module.
Indexing this mapping object with call_reference[0] returns the module object containing the function to be called.
The module object's __dict__ maps each attribute-name of the module to the object referenced by that attribute. Functions defined in the module also count as attributes of the module.
Thus, indexing __dict__ with the function name call_reference[1] returns the function object.

Given an imported module, how can I determine the import path?

I have a Python function that takes an imported module as a parameter:
def printModule(module):
    print("That module is named '%s'" % magic(module))
import foo.bar.baz
printModule(foo.bar.baz)
What I want is to be able to extract the module name (in this case foo.bar.baz) from a passed reference to the module. In the above example, the magic() function is a stand-in for the function I want.
__name__ would normally work, but that requires executing in the context of the passed module, and I'm trying to extract this information from merely a reference to the module.
What is the proper procedure here?
All I can think of is either doing string(module), and then some text hacking, or trying to inject a function into the module that I can then call to have return __name__, and neither of those solutions is elegant. (I tried both these, neither actually work. Modules can apparently permute their name in the string() representation, and injecting a function into the module and then calling it just returns the caller's context.)
The __name__ attribute seems to work:
def magic(m):
    return m.__name__
If you have a string with the module name, you can use pkgutil.
import pkgutil
pkg = pkgutil.get_loader(module_name)
print pkg.fullname
From the module itself,
import pkgutil
pkg = pkgutil.get_loader(module.__name__)
print pkg.fullname
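The two snippets above are written for Python 2. On Python 3, pkgutil.get_loader has been deprecated since 3.12 and the documentation points to importlib.util.find_spec() instead. A small sketch of both directions, using xml.dom.minidom from the standard library as the example module:

```python
import importlib.util
import xml.dom.minidom


def magic(module):
    """Return the dotted import path of an already imported module."""
    return module.__name__


print(magic(xml.dom.minidom))  # xml.dom.minidom

# Starting from a string instead of a module object:
spec = importlib.util.find_spec("xml.dom.minidom")
print(spec.name)    # xml.dom.minidom
print(spec.origin)  # location the module is loaded from
```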

python get module variable by name

I'm trying to find a way to access module variable by name, but haven't found anything yet. The thing I'm using now is:
var = eval('myModule.%s' % (variableName))
but it's fuzzy and breaks IDE error checking (i.e. in eclipse/pydev import myModule is marked as unused, while it's needed for above line). Is there any better way to do it? Possibly a module built-in function I don't know?
import mymodule
var = getattr(mymodule, variablename)
getattr(themodule, "attribute_name", None)
The third argument is the default value if the attribute does not exist.
From https://docs.python.org/2/library/functions.html#getattr
Return the value of the named attribute of object. name must be a string. If the string is the name of one of the object’s attributes, the result is the value of that attribute. For example, getattr(x, 'foobar') is equivalent to x.foobar. If the named attribute does not exist, default is returned if provided, otherwise AttributeError is raised.
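A short illustration of the three cases described in that paragraph, using the math module only as a convenient example:

```python
import math

variable_name = "pi"
print(getattr(math, variable_name))         # 3.141592653589793

# With a default, a missing name does not raise.
print(getattr(math, "no_such_name", None))  # None

# Without a default, a missing name raises AttributeError.
try:
    getattr(math, "no_such_name")
except AttributeError as exc:
    print("missing:", exc)
```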
If you want to get a variable from the current module you're running in (and not another one):
import sys
# sys.modules[__name__] is the instance of the current module
var = getattr(sys.modules[__name__], 'var_name')
var can of course be a regular variable, a class, or basically anything else :)
Another option is to use the inspect module and its getmembers(module) function.
It will return a complete list of what is in the module, which then can be cached. eg.
>>> cache = dict(inspect.getmembers(module))
>>> cache["__name__"]
Pyfile1 test
>>> cache["__email__"]
'name@email.com'
>>> cache["test"]("abcdef")
test, abcdef
The advantage here is that you are only performing the look up once, and it assumes that the module is not changing during the program execution.
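A runnable variant of the same idea, using the standard library's string module as a stand-in for your own module:

```python
import inspect
import string

# One-time snapshot of everything the module exposes.
cache = dict(inspect.getmembers(string))

print(cache["__name__"])           # string
print(cache["digits"])             # 0123456789
print(cache["capwords"]("ab cd"))  # Ab Cd
```

Because the dictionary is a snapshot, names added to the module afterwards will not appear in it.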

How do you get all classes defined in a module but not imported?

I've already seen the following question but it doesn't quite get me where I want: How can I get a list of all classes within current module in Python?
In particular, I do not want classes that are imported, e.g. if I had the following module:
from my.namespace import MyBaseClass
from somewhere.else import SomeOtherClass
class NewClass(MyBaseClass):
    pass
class AnotherClass(MyBaseClass):
    pass
class YetAnotherClass(MyBaseClass):
    pass
If I use clsmembers = inspect.getmembers(sys.modules[__name__], inspect.isclass) like the accepted answer in the linked question suggests, it would return MyBaseClass and SomeOtherClass in addition to the 3 defined in this module.
How can I get only NewClass, AnotherClass and YetAnotherClass?
Inspect the __module__ attribute of the class to find out which module it was defined in.
I apologize for answering such an old question, but I didn't feel comfortable using the inspect module for this solution. I read somewhere that it wasn't safe to use in production.
Initialize all the classes in a module into nameless objects in a list
See Antonis Christofides comment to answer 1.
I got the answer for testing if an object is a class from
How to check whether a variable is a class or not?
So this is my inspect-free solution
def classesinmodule(module):
    md = module.__dict__
    return [
        md[c] for c in md if (
            isinstance(md[c], type) and md[c].__module__ == module.__name__
        )
    ]
classesinmodule(modulename)
You may also want to consider using the "Python class browser" module in the standard library:
http://docs.python.org/library/pyclbr.html
Since it doesn't actually execute the module in question (it does naive source inspection instead) there are some specific techniques it doesn't quite understand correctly, but for all "normal" class definitions, it will describe them accurately.
I used the below:
import inspect
import sys
# Predicate to make sure the classes only come from the module in question
def pred(c):
    return inspect.isclass(c) and c.__module__ == pred.__module__
# fetch all members of module __name__ matching 'pred'
classes = inspect.getmembers(sys.modules[__name__], pred)
I didn't want to type the current module name in
from pyclbr import readmodule
clsmembers = readmodule(__name__).items()
The Keep it simple solution:
> python.exe -c "import unittest"
> python.exe -c "from .panda_tools import *"
Command line help: -c cmd : program passed in as string (terminates option list)
Python will attempt to load the module; if it is not found, you get one of these error types:
ModuleNotFoundError: No module named 'unittest'
ImportError: attempted relative import with no known parent package

Lazy module variables--can it be done?

I'm trying to find a way to lazily load a module-level variable.
Specifically, I've written a tiny Python library to talk to iTunes, and I want to have a DOWNLOAD_FOLDER_PATH module variable. Unfortunately, iTunes won't tell you where its download folder is, so I've written a function that grabs the filepath of a few podcast tracks and climbs back up the directory tree until it finds the "Downloads" directory.
This takes a second or two, so I'd like to have it evaluated lazily, rather than at module import time.
Is there any way to lazily assign a module variable when it's first accessed or will I have to rely on a function?
You can't do it with modules, but you can disguise a class "as if" it was a module, e.g., in itun.py, code...:
import sys
class _Sneaky(object):
    def __init__(self):
        self.download = None
    @property
    def DOWNLOAD_PATH(self):
        if not self.download:
            self.download = heavyComputations()
        return self.download
    def __getattr__(self, name):
        return globals()[name]
# other parts of itun that you WANT to code in
# module-ish ways
sys.modules[__name__] = _Sneaky()
Now anybody can import itun... and get in fact your itun._Sneaky() instance. (The __getattr__ is there to let you access anything else in itun.py that may be more convenient for you to code as a top-level module object than inside _Sneaky!)
It turns out that as of Python 3.7, it's possible to do this cleanly by defining a __getattr__() at the module level, as specified in PEP 562 and documented in the data model chapter in the Python reference documentation.
# mymodule.py
from typing import Any
DOWNLOAD_FOLDER_PATH: str
def _download_folder_path() -> str:
    global DOWNLOAD_FOLDER_PATH
    DOWNLOAD_FOLDER_PATH = ... # compute however ...
    return DOWNLOAD_FOLDER_PATH
def __getattr__(name: str) -> Any:
    if name == "DOWNLOAD_FOLDER_PATH":
        return _download_folder_path()
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
I used Alex' implementation on Python 3.3, but this crashes miserably:
The code
def __getattr__(self, name):
    return globals()[name]
is not correct because an AttributeError should be raised, not a KeyError.
This crashed immediately under Python 3.3, because a lot of introspection is done
during the import, looking for attributes like __path__, __loader__ etc.
Here is the version that we use now in our project to allow for lazy imports
in a module. The __init__ of the module is delayed until the first attribute access
that has not a special name:
""" config.py """
# lazy initialization of this module to avoid circular import.
# the trick is to replace this module by an instance!
# modelled after a post from Alex Martelli :-)
# (Lazy module variables--can it be done?)
import sys
class _Sneaky(object):
    def __init__(self, name):
        self.module = sys.modules[name]
        sys.modules[name] = self
        self.initializing = True
    def __getattr__(self, name):
        # call module.__init__ after import introspection is done
        if self.initializing and not name[:2] == '__' == name[-2:]:
            self.initializing = False
            __init__(self.module)
        return getattr(self.module, name)
_Sneaky(__name__)
The module now needs to define an init function. This function can be used
to import modules that might import ourselves:
def __init__(module):
    ...
    # do something that imports config.py again
    ...
The code can be put into another module, and it can be extended with properties
as in the examples above.
Maybe that is useful for somebody.
For Python 3.5 and 3.6, the proper way of doing this, according to the Python docs, is to subclass types.ModuleType and then dynamically update the module's __class__. So, here's a solution loosely based on Christian Tismer's answer but probably not resembling it much at all:
import sys
import types
class _Sneaky(types.ModuleType):
    @property
    def DOWNLOAD_FOLDER_PATH(self):
        if not hasattr(self, '_download_folder_path'):
            self._download_folder_path = '/dev/block/'
        return self._download_folder_path
sys.modules[__name__].__class__ = _Sneaky
For Python 3.7 and later, you can define a module-level __getattr__() function. See PEP 562 for details.
Since Python 3.7 (and as a result of PEP-562), this is now possible with the module-level __getattr__:
Inside your module, put something like:
def _long_function():
    # print() function to show this is called only once
    print("Determining DOWNLOAD_FOLDER_PATH...")
    # Determine the module-level variable
    path = "/some/path/here"
    # Set the global (module scope)
    globals()['DOWNLOAD_FOLDER_PATH'] = path
    # ... and return it
    return path
def __getattr__(name):
    if name == "DOWNLOAD_FOLDER_PATH":
        return _long_function()
    # Implicit else
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
From this it should be clear that the _long_function() isn't executed when you import your module, e.g.:
print("-- before import --")
import somemodule
print("-- after import --")
results in just:
-- before import --
-- after import --
But when you attempt to access the name from the module, the module-level __getattr__ will be called, which in turn will call _long_function, which will perform the long-running task, cache it as a module-level variable, and return the result back to the code that called it.
For example, with the first block above inside the module "somemodule.py", the following code:
import somemodule
print("--")
print(somemodule.DOWNLOAD_FOLDER_PATH)
print('--')
print(somemodule.DOWNLOAD_FOLDER_PATH)
print('--')
produces:
--
Determining DOWNLOAD_FOLDER_PATH...
/some/path/here
--
/some/path/here
--
or, more clearly:
# LINE OF CODE                          # OUTPUT
import somemodule                       # (nothing)
print("--")                             # --
print(somemodule.DOWNLOAD_FOLDER_PATH)  # Determining DOWNLOAD_FOLDER_PATH...
                                        # /some/path/here
print("--")                             # --
print(somemodule.DOWNLOAD_FOLDER_PATH)  # /some/path/here
print("--")                             # --
Lastly, you can also implement __dir__ as the PEP describes if you want to indicate (e.g. to code introspection tools) that DOWNLOAD_FOLDER_PATH is available.
Is there any way to lazily assign a module variable when it's first accessed or will I have to rely on a function?
I think you are correct in saying that a function is the best solution to your problem here.
I will give you a brief example to illustrate.
#myfile.py - an example module with some expensive module level code.
import os
# expensive operation to crawl up in directory structure
The expensive operation will be executed on import if it is at module level. There is not a way to stop this, short of lazily importing the entire module!!
#myfile2.py - a module with expensive code placed inside a function.
import os
def getdownloadsfolder(curdir=None):
    """a function that will search upward from the user's current directory
    to find the 'Downloads' folder."""
    # expensive operation now here.
You will be following best practice by using this method.
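If you go the function route, the standard library can take care of the "compute once" part. A minimal sketch using functools.lru_cache; get_downloads_folder, call_count and the returned path are placeholders, not the asker's real lookup:

```python
import functools

call_count = 0  # only here to show how often the slow part runs


@functools.lru_cache(maxsize=None)
def get_downloads_folder():
    """Hypothetical stand-in for the expensive directory search."""
    global call_count
    call_count += 1
    return "/some/path/Downloads"  # placeholder value, not a real lookup


print(get_downloads_folder())
print(get_downloads_folder())
print(call_count)  # 1
```

Nothing expensive happens at import time; the first call pays the cost and every later call returns the cached value.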
Recently I came across the same problem, and have found a way to do it.
class LazyObject(object):
    def __init__(self):
        self.initialized = False
        setattr(self, 'data', None)
    def init(self, *args):
        #print 'initializing'
        pass
    def __len__(self): return len(self.data)
    def __repr__(self): return repr(self.data)
    def __getattribute__(self, key):
        if object.__getattribute__(self, 'initialized') == False:
            object.__getattribute__(self, 'init')(self)
            setattr(self, 'initialized', True)
        if key == 'data':
            return object.__getattribute__(self, 'data')
        else:
            try:
                return object.__getattribute__(self, 'data').__getattribute__(key)
            except AttributeError:
                return super(LazyObject, self).__getattribute__(key)
With this LazyObject, you can define an init method for the object, and the object will be initialized lazily. Example code looks like:
import time
o = LazyObject()
def slow_init(self):
    time.sleep(1) # simulate slow initialization
    self.data = 'done'
o.init = slow_init
The o object above will have exactly the same methods as the 'done' string has. For example, you can do:
# o will be initialized, then apply the `len` method
assert len(o) == 4
complete code with tests (works in 2.7) can be found here:
https://gist.github.com/observerss/007fedc5b74c74f3ea08
If that variable lived in a class rather than a module, then you could overload getattr, or better yet, populate it in init.
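A sketch of the class-based variant with __getattr__. The class name ITunesSettings, the helper _find_download_folder and the returned path are all made up for the illustration:

```python
class ITunesSettings:
    """Hypothetical holder for values that are expensive to compute."""

    def _find_download_folder(self):
        # Placeholder for the slow directory search.
        return "/some/path/Downloads"

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name == "DOWNLOAD_FOLDER_PATH":
            value = self._find_download_folder()
            # Store on the instance so later lookups bypass __getattr__.
            self.DOWNLOAD_FOLDER_PATH = value
            return value
        raise AttributeError(name)


settings = ITunesSettings()
print("DOWNLOAD_FOLDER_PATH" in vars(settings))  # False
print(settings.DOWNLOAD_FOLDER_PATH)
print("DOWNLOAD_FOLDER_PATH" in vars(settings))  # True
```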
SPEC 1
Probably the best known recipe for Lazy Loading module attributes (and modules) is in SPEC 1 (Draft) at scientific-python.org. SPECs are operational guidelines for projects in the Scientific Python ecosystem. There is discussion around SPEC 1 at Scientific Python Discourse and the solution is offered as a package on PyPI as lazy_loader. The lazy_loader implementation relies on the module __getattr__ support introduced in Python 3.7 (PEP 562), and it is used in scikit-image, NetworkX, and partially in SciPy.
Example usage:
The following example is using the lazy_loader PyPI package. You could also just copy-paste the source code to be part of your project.
# mypackage/__init__.py
import lazy_loader
__getattr__, __dir__, __all__ = lazy_loader.attach(
    __name__,
    submodules=['bar'],
    submod_attrs={
        'foo.morefoo': ['FooFilter', 'do_foo', 'MODULE_VARIABLE'],
        'grok.spam': ['spam_a', 'spam_b', 'spam_c']
    }
)
this is the lazy import equivalent of
from . import bar
from .foo.morefoo import FooFilter, do_foo, MODULE_VARIABLE
from .grok.spam import (spam_a, spam_b, spam_c)
Short explanation on lazy_loader.attach
If you want to lazy-load a module, you list it in submodules (which is a list)
If you want to lazy-load something from a module (function, class, etc.), you list it in submod_attrs (which is a dict)
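To give a feel for the mechanism, here is a heavily simplified sketch of the submod_attrs idea built only on importlib and PEP 562 style functions. It is not the lazy_loader implementation; attach_sketch and its return values are hypothetical, and the standard library's json package is used so the example runs without extra installs.

```python
import importlib


def attach_sketch(package_name, submod_attrs):
    """Simplified, hypothetical stand-in; not the lazy_loader implementation."""
    # Map each exported attribute to the submodule that defines it.
    attr_to_submodule = {
        attr: submodule
        for submodule, attrs in submod_attrs.items()
        for attr in attrs
    }

    def __getattr__(name):
        if name in attr_to_submodule:
            # The import happens here, on first access, not at package import.
            module = importlib.import_module(
                f"{package_name}.{attr_to_submodule[name]}"
            )
            return getattr(module, name)
        raise AttributeError(f"module {package_name!r} has no attribute {name!r}")

    def __dir__():
        return sorted(attr_to_submodule)

    return __getattr__, __dir__, sorted(attr_to_submodule)


# Demonstrated against the standard library's json package.
lazy_getattr, lazy_dir, lazy_all = attach_sketch("json", {"decoder": ["JSONDecoder"]})
print(lazy_all)                              # ['JSONDecoder']
print(lazy_getattr("JSONDecoder").__name__)  # JSONDecoder
```

In a package's __init__.py the three returned objects would be bound to __getattr__, __dir__ and __all__, mirroring the shape of the lazy_loader.attach call shown above.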
Type checking
Static type checkers and IDEs cannot infer type information from lazily loaded imports. As workaround, you may use type stubs (.pyi files), like this:
# mypackage/__init__.pyi
from .foo.morefoo import FooFilter as FooFilter, do_foo as do_foo, MODULE_VARIABLE as MODULE_VARIABLE
from .grok.spam import spam_a as spam_a, spam_b as spam_b, spam_c as spam_c
SPEC 1 mentions that this X as X syntax is necessary due to PEP 484.
Side notes
There was recently a PEP for Lazy Imports, PEP 690, but it was rejected.
In TensorFlow, there is a lazy loading class at util.lazyloader.
There is one blog post from Brett Cannon (a Python core developer), where he showed in 2018 a module __getattr__ based implementation of lazy_loader, and provided it in a package called modutil, but the project is marked archived in GitHub. This has been an inspiration for the scientific-python lazy_loader.
