String representation of Python modules [duplicate] - python

Can a python module have a __repr__? The idea would be to do something like:
import mymodule
print mymodule
EDIT: precision: I mean a user-defined repr!

Short answer: basically the answer is no.
But can't you find the functionality you are looking for using docstrings?
testmodule.py
""" my module test does x and y
"""
class myclass(object):
    ...
test.py
import testmodule
print testmodule.__doc__
Long answer:
You can define your own __repr__ at module level (just provide def __repr__(...)), but then you'd have to do:
import mymodule
print mymodule.__repr__()
to get the functionality you want.
Have a look at the following python shell session:
>>> import sys # we import the module
>>> sys.__repr__() # works as usual
"<module 'sys' (built-in)>"
>>> sys.__dict__['__repr__'] # but it's not in the module's __dict__ ?
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '__repr__'
>>> sys.__class__.__dict__['__repr__'] # __repr__ is provided on the module type as a slot wrapper
<slot wrapper '__repr__' of 'module' objects>
>>> sys.__class__.__dict__['__repr__'](sys) # which we should feed an instance of the module type
"<module 'sys' (built-in)>"
So I believe the problem lies with these slot wrapper objects, which (from what can be read at the link) bypass the usual Python way of looking up attributes on the instance.
For these special methods, CPython keeps C function pointers on the type (which then get wrapped in slot wrapper objects so they are callable from the Python side).

You can achieve this effect--if you're willing to turn to the Dark Side of the Force.
Add this to mymodule.py:
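import sys
# mymodule is not a defined name inside mymodule.py, so take the class from sys.modules
class MyReprModule(sys.modules[__name__].__class__):
    def __init__(self, other):
        # copy every attribute of the original module onto the replacement
        for attr in dir(other):
            setattr(self, attr, getattr(other, attr))
    def __repr__(self):
        return 'ABCDEFGHIJKLMNOQ'
# THIS LINE MUST BE THE LAST LINE IN YOUR MODULE
sys.modules[__name__] = MyReprModule(sys.modules[__name__])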
Lo and behold:
>>> import mymodule
>>> print mymodule
ABCDEFGHIJKLMNOQ
I dimly remember, in previous attempts at similarly evil hacks, having trouble setting special attributes like __class__. I didn't have that trouble when testing this. If you run into that problem, just catch the exception and skip that attribute.
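On Python 3.5 and later, a less invasive variant is to reassign the module's __class__ rather than replace the entry in sys.modules. The following is a minimal sketch under that assumption. The names ReprModule and mymodule_demo are made up for illustration, and the stand-in module is built with types.ModuleType so the example runs on its own; inside a real mymodule.py the assignment would target sys.modules[__name__].

```python
import types


class ReprModule(types.ModuleType):
    """Module subclass whose only job is to supply a custom repr."""

    def __repr__(self):
        return "<my very own module repr>"


# Stand-in for a real module; inside mymodule.py you would use
# sys.modules[__name__] instead (assumption: Python 3.5+).
mymodule_demo = types.ModuleType("mymodule_demo")
mymodule_demo.answer = 42

# Swap the class in place; the module object and its attributes are untouched.
mymodule_demo.__class__ = ReprModule

text = repr(mymodule_demo)
```

Because the module object itself is kept, nothing has to be copied over attribute by attribute.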

Modules can have a __repr__ function, but it isn't invoked when getting the representation of a module.
So no, you can't do what you want.

As a matter of fact, many modules do [have a __repr__]!
>>> import sys
>>> print(sys)
<module 'sys' (built-in)> #read edit, however, this info didn't come from __repr__ !
Also try dir(sys) to see that __repr__ is there along with __name__, etc.
Edit:
__repr__ seems to be found in modules in Python 3.0 and up.
As indicated by Ned Batchelder, this method is not used by Python when it prints out a module. (A quick experiment, where the __repr__ attribute was re-assigned, showed that...)

No, because __repr__ is a special method (I call it a capability), and it is only ever looked up on the class. Your module is just another instance of the module type, so however you would manage to define a __repr__, it would not be called!
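The lookup rule can be checked with a small experiment. This is a sketch for Python 3 that uses a throwaway module object (the name demo is arbitrary) instead of a module file: a __repr__ stored in the module's namespace is reachable by an explicit attribute call, but repr() goes to the type and never sees it.

```python
import types

demo = types.ModuleType("demo")

# Equivalent to writing "def __repr__(): ..." at module level:
# the function lands in the module's own namespace.
demo.__repr__ = lambda: "custom repr"

# An explicit attribute lookup finds the function in the instance namespace.
explicit_call = demo.__repr__()

# repr() looks the special method up on type(demo), so the
# module-level function is ignored.
builtin_repr = repr(demo)
```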

Related

Import modules that don't exist (yet)

I wish to create my own variation of amoffat's sh module, where it can import pretty much any command from the user's UNIX path, such as:
from sh import hg
However, I am having a hard time finding a way to intercept / override python's own import [...] and from [...] import [...]. At this point I simply need a way to at least get [the name of] the object of the from import, at which point I can simply setattr() and partial() my way from there, I hope. I'm at a complete loss of how to do this at the moment, however, and hence, have no code to show for it.
The gist of what I'm going for:
from test import t # Even though "t" doesn't exist in the module (yet)
Any help with the full code would be greatly appreciated!
Final Answer, consolidated:
def __getattr__(name):
    if name == '__path__': raise AttributeError
    print(name)
There is actually a straightforward way if you are on Python 3.7+, PEP-562, which allows you to define __getattr__ at the module level:
def __getattr__(name):
    if name == "t":
        return "magic"
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
There is also a function __dir__ that you can define to declare what the builtin dir() will say about names in your module.
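A combined sketch of both hooks, assuming Python 3.7+. To keep it self-contained, the module body is held in a string and executed into a fresh module object; pep562_demo is a made-up module name, and in practice the two functions would simply live in the module's own source file.

```python
import sys
import types

MODULE_SOURCE = '''
def __getattr__(name):
    if name == "t":
        return "magic"
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

def __dir__():
    # Advertise the dynamic name so dir() and tab completion can see it.
    return ["t"]
'''

# "pep562_demo" is a made-up module name used only for this illustration.
demo = types.ModuleType("pep562_demo")
exec(MODULE_SOURCE, demo.__dict__)
sys.modules["pep562_demo"] = demo

# "t" is never defined in the module; the module-level __getattr__ supplies it.
from pep562_demo import t

names = dir(demo)
has_other = hasattr(demo, "u")
```

Raising AttributeError for every name the hook does not handle keeps hasattr() and the import machinery's own probes (such as __path__) behaving normally.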
What sh does is more sophisticated, as they want to support versions below 3.7: Modifying sys.modules and replacing the module with a special object that pretends to be a module.
As @L3viathan pointed out, this is easy starting with Python 3.7: just define a __getattr__ function in your special module. So, for example, you could create an "echo" module (just returns the name of the object you requested) like this:
echo.py (Python >=3.7)
def __getattr__(name):
    return name
Then you could use it like this:
from echo import x
print(repr(x))
# 'x'
On earlier versions of Python, you have to subclass the module, as hinted in PEP-562. This also works in Python 3.7.
echo.py (Python >=2)
import sys, types
class EchoModule(types.ModuleType):
    def __getattr__(self, name):
        return name
sys.modules[__name__] = EchoModule(__name__)
You would use this the same way as the 3.7 version: from echo import something.
Update
For some reason Python tries to retrieve the attribute twice for each from echo import <x> call. It also calls __getattr__('__path__') when the module is loaded. You can avoid side effects in these cases with the following code:
echo.py (only define attributes once)
import sys, types
class EchoModule(types.ModuleType):
    def __getattr__(self, name):
        # don't define __path__ attribute
        if name == '__path__':
            raise AttributeError
        print("importing {}".format(name))
        # create the attribute in case it's required again
        setattr(self, name, name)
        # return the new attribute
        return getattr(self, name)
sys.modules[__name__] = EchoModule(__name__)
This code creates an attribute in the echo module each time a previously unused attribute is imported (sort of like collections.defaultdict). Then, if Python tries to import that same attribute again later, it will pull it directly from the module instead of calling __getattr__ (this is normal behavior for object attributes).
There is also some code here to avoid setting a spurious __path__ attribute; this also avoids running your code when __path__ is requested. Note that this may actually be the most important part; when I tested, just raising AttributeError for __path__ was enough to prevent the double-access to the named attribute.
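To connect this back to the sh-style goal in the question, here is a rough sketch of how the same pattern might hand out command runners instead of strings. Everything specific in it is an assumption made for illustration: the module name commands_demo, the class name CommandModule, and the choice to run commands through subprocess.run. It is not how sh itself is implemented.

```python
import subprocess
import sys
import types


class CommandModule(types.ModuleType):
    """Module stand-in that turns unknown attribute names into command runners."""

    def __getattr__(self, name):
        # Let dunder lookups (__path__, __spec__, ...) fail normally so the
        # import machinery is not confused by made-up values.
        if name.startswith("__"):
            raise AttributeError(name)

        def run(*args):
            # Hypothetical behaviour: run the command with the given arguments.
            return subprocess.run([name, *args], check=True)

        run.__name__ = name
        # Cache the runner so later lookups bypass __getattr__.
        setattr(self, name, run)
        return run


# "commands_demo" is an illustrative module name, not a real package.
sys.modules["commands_demo"] = CommandModule("commands_demo")

# No command is executed here; the import only creates the runner.
from commands_demo import hg
```

Calling hg("status") would then attempt to run the hg executable found on the PATH.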

Altering traceback of a non-callable module

I'm a minor contributor to a package where people are meant to do this (Foo.Bar.Bar is a class):
>>> from Foo.Bar import Bar
>>> s = Bar('a')
Sometimes people do this by mistake (Foo.Bar is a module):
>>> from Foo import Bar
>>> s = Bar('a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'module' object is not callable
This might seem simple, but users still fail to debug it, and I would like to make it easier. I can't change the names of Foo or Bar, but I would like to add a more informative traceback like:
TypeError("'module' object is not callable, perhaps you meant to call 'Bar.Bar()'")
I read the Callable modules Q&A, and I know that I can't add a __call__ method to a module (and I don't want to wrap the whole module in a class just for this). Anyway, I don't want the module to be callable, I just want a custom traceback. Is there a clean solution for Python 3.x and 2.7+?
Add this to top of Bar.py: (Based on this question)
import sys
this_module = sys.modules[__name__]
class MyModule(sys.modules[__name__].__class__):
    def __call__(self, *a, **k): # module callable
        raise TypeError("'module' object is not callable, perhaps you meant to call 'Bar.Bar()'")
    def __getattribute__(self, name):
        return this_module.__getattribute__(name)
sys.modules[__name__] = MyModule(__name__)
# the rest of file
class Bar:
    pass
Note: Tested with python3.6 & python2.7.
What you want is to change the error message when it is displayed to the user. One way to do that is to define your own excepthook.
Your own function could:
search the calling frame in the traceback object (which contains information about the TypeError exception and the function that raised it),
search the Bar object in the local variables,
alter the error message if the object is a module instead of a class or function.
In Foo/__init__.py you can install your excepthook:
import inspect
import sys
def _install_foo_excepthook():
    _sys_excepthook = sys.excepthook
    def _foo_excepthook(exc_type, exc_value, exc_traceback):
        if exc_type is TypeError:
            # -- find the last frame (source of the exception)
            tb_frame = exc_traceback
            while tb_frame.tb_next is not None:
                tb_frame = tb_frame.tb_next
            # -- search 'Bar' in the local variable
            f_locals = tb_frame.tb_frame.f_locals
            if 'Bar' in f_locals:
                obj = f_locals['Bar']
                if inspect.ismodule(obj):
                    # -- change the error message
                    exc_value.args = ("'module' object is not callable, perhaps you meant to call 'Foo.Bar.Bar()'",)
        _sys_excepthook(exc_type, exc_value, exc_traceback)
    sys.excepthook = _foo_excepthook
_install_foo_excepthook()
Of course, you need to enforce this algorithm…
With the following demo:
# coding: utf-8
from Foo import Bar
s = Bar('a')
You get:
Traceback (most recent call last):
  File "/path/to/demo_bad.py", line 5, in <module>
    s = Bar('a')
TypeError: 'module' object is not callable, perhaps you meant to call 'Foo.Bar.Bar()'
There are a lot of ways you could get a different error message, but they all have weird caveats and side effects.
Replacing the module's __class__ with a types.ModuleType subclass is probably the cleanest option, but it only works on Python 3.5+.
Besides the 3.5+ limitation, the primary weird side effects I've thought of for this option are that the module will be reported callable by the callable function, and that reloading the module will replace its class again unless you're careful to avoid such double-replacement.
Replacing the module object with a different object works on pre-3.5 Python versions, but it's very tricky to get completely right.
Submodules, reloading, global variables, any module functionality besides the custom error message... all of those are likely to break if you miss some subtle aspect of the implementation. Also, the module will be reported callable by callable, just like with the __class__ replacement.
Trying to modify the exception message after the exception is raised, for example in sys.excepthook, is possible, but there isn't a good way to tell that any particular TypeError came from trying to call your module as a function.
Probably the best you could do would be to check for a TypeError with a 'module' object is not callable message in a namespace where it looks plausible that your module would have been called - for example, if the Bar name is bound to the Foo.Bar module in either the frame's locals or globals - but that's still going to have plenty of false negatives and false positives. Also, sys.excepthook replacement isn't compatible with IPython, and whatever mechanism you use would probably conflict with something.
Right now, the problems you have are easy to understand and easy to explain. The problems you would have with any attempt to change the error message are likely to be much harder to understand and harder to explain. It's probably not a worthwhile tradeoff.
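For reference, the __class__ replacement option listed above might look like the following sketch on Python 3.5+. The class name HelpfulModule and the stand-in module object are assumptions made so the example runs by itself; in Bar.py the assignment would be applied to sys.modules[__name__]. The last line shows the side effect mentioned above: callable() now reports the module as callable.

```python
import types


class HelpfulModule(types.ModuleType):
    """Module subclass that only exists to raise a friendlier TypeError."""

    def __call__(self, *args, **kwargs):
        raise TypeError(
            "'module' object is not callable, "
            "perhaps you meant to call 'Bar.Bar()'"
        )


# Stand-in for the real Foo.Bar module; in Bar.py this would be
# sys.modules[__name__] (assumption: Python 3.5+).
bar_module = types.ModuleType("Bar")
bar_module.__class__ = HelpfulModule

try:
    bar_module("a")
except TypeError as exc:
    message = str(exc)

# Side effect: the module now claims to be callable.
reported_callable = callable(bar_module)
```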

Lazy loading python sub-modules, importlib fails first time

I'm experimenting with the idea of lazy-loading of symbols in a package's __init__.py by subclassing ModuleType and defining properties for each of the submodules. Accessing the symbol in the package namespace would trigger the import. I've got it working, but for some reason, my call to import_module fails on the first attempt and I don't understand why.
I have a minimal example. Assume a package like this:
my_package:
__init__.py
m1.py
this is the __init__.py
import sys
import importlib
from types import ModuleType
class MyModule(ModuleType):
    @property
    def m1(self):
        try:
            _m1 = importlib.import_module('.m1', __package__)
        except AttributeError:
            print('second try ...')
            _m1 = importlib.import_module('.m1', __package__)
        return _m1
old = sys.modules[__name__]
new = MyModule(__name__)
new.__path__ = old.__path__
for k, v in list(old.__dict__.items()):
    new.__dict__[k] = v
sys.modules[__name__] = new
The import_module call always fails with an AttributeError: module 'my_package' has no attribute 'm1'. However, the second call always succeeds. In other words, when I do my_package.m1 I always get m1, but it always prints 'second try ...'.
Note: the behavior depends on the Python version. The call to import_module works fine the first time on Python 2.7.
Here is the difference between python2 vs python3.
In python3, the importlib.import_module call ultimately ends up
here
which is a call to setattr. Since you didn't define a .setter for
your property, you get the AttributeError.
In python2, the importlib.import_module call ends up
here
which is a call to the builtin __import__ which presumably operates
directly on the module __dict__.
The only question is how in the world it ever works in python3. I
would have thought it would always result in an AttributeError.
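The setattr failure can be reproduced without importing anything. This is a minimal sketch with a made-up class name (PackageModule): a property without a setter on a ModuleType subclass rejects the same kind of attribute assignment that the import system performs on the parent package. One plausible explanation for the second attempt succeeding (an assumption, not checked against the importlib source here) is that the submodule is already in sys.modules by then, so it is returned without another assignment.

```python
import types


class PackageModule(types.ModuleType):
    @property
    def m1(self):
        # Read-only property: there is deliberately no setter.
        return "lazy value"


pkg = PackageModule("pkg")

try:
    # Roughly what happens after a submodule has been loaded:
    # it gets bound as an attribute on its parent package.
    setattr(pkg, "m1", object())
except AttributeError as exc:
    error = exc

value = pkg.m1
```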
Your code works fine as long as you make a .setter:
@m1.setter
def m1(self, mod):
    self.__dict__['m1'] = mod
It actually turns out that the .setter can do anything at all,
including pass since you are unconditionally making the call to
import_module.
I would consider using the .setter above and changing the getter to:
@property
def m1(self):
    if not self.__dict__.get('m1'):
        self.__dict__['m1'] = importlib.import_module('.m1', __package__)
    return self.__dict__['m1']
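On Python 3.7+, a module-level __getattr__ (PEP 562) may be a simpler way to get the same lazy loading without subclassing ModuleType at all. The sketch below is an alternative to the property approach, not the code from the question: it writes a throwaway package to a temporary directory so that it can run on its own, and the names my_lazy_package, _LAZY_SUBMODULES and VALUE are invented for the example.

```python
import os
import sys
import tempfile

# Contents of the package's __init__.py (assumption: Python 3.7+).
INIT_SOURCE = '''
import importlib

_LAZY_SUBMODULES = {"m1"}

def __getattr__(name):
    if name in _LAZY_SUBMODULES:
        # The import system binds the submodule on the package afterwards,
        # so this hook only runs on the first access.
        return importlib.import_module("." + name, __name__)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

# Build a throwaway package on disk so the example is self-contained.
package_root = tempfile.mkdtemp()
package_dir = os.path.join(package_root, "my_lazy_package")
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as fh:
    fh.write(INIT_SOURCE)
with open(os.path.join(package_dir, "m1.py"), "w") as fh:
    fh.write("VALUE = 42\n")

sys.path.insert(0, package_root)
import my_lazy_package

loaded_before = "my_lazy_package.m1" in sys.modules
value = my_lazy_package.m1.VALUE  # first access triggers the import
loaded_after = "my_lazy_package.m1" in sys.modules
```

Since the package stays an ordinary module, the attribute assignment made by the import system succeeds and no setter is needed.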

in python, is there a way to find the module that contains a variable or other object from the object itself?

As an example, say I have a variable defined where there may be multiple
from __ import *
from ____ import *
etc.
Is there a way to figure out where one of the variables in the namespace is defined?
edit
Thanks, but I already understand that import * is often considered poor form. That wasn't the question though, and in any case I didn't write it. It'd just be nice to have a way to find where the variable came from.
This is why it is considered bad form to use from __ import * in Python in most cases. Either use from __ import myFunc or else import __ as myLib. Then when you need something from myLib, it doesn't overlap with something else.
For help finding things in the current namespace, check out the pprint library, the dir builtin, the locals builtin, and the globals builtin.
No, the names defined by from blah import * don't retain any information about where they came from. The values might have a clue, for example, classes have a __module__ attribute, but they may have been defined in one module, then imported from another, so you can't count on them being the values you expect.
Sort-of, for example:
>>> from zope.interface.common.idatetime import *
>>> print IDate.__module__
'zope.interface.common.idatetime'
>>> print Attribute.__module__
'zope.interface.interface'
The module of the Attribute may seem surprising since that is not where you imported it from, but it is where the Attribute type was defined. Looking at zope/interface/common/idatetime.py, we see:
from zope.interface import Interface, Attribute
which explains the value of __module__. You'll also run into problems with instances of types imported from other modules. Suppose that you create an Attribute instance named att:
>>> att = Attribute('foo')
>>> print att.__module__
'zope.interface.interface'
Again, you're learning where the type came from, but not where the variable was defined.
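The same effect can be seen with the standard library alone. This is a small Python 3 sketch: join is imported from os.path, yet its __module__ names the platform-specific module where it is actually defined, and inspect.getmodule resolves that name to a module object.

```python
import inspect
from collections import OrderedDict
from os.path import join

# Where each object was defined, which is not necessarily where it was imported from.
join_origin = join.__module__
ordered_dict_origin = OrderedDict.__module__

# inspect.getmodule resolves the defining module object when it can.
join_module = inspect.getmodule(join)
```

On a POSIX system join_origin is posixpath and on Windows it is ntpath, even though the import statement says os.path.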
Quite possibly the biggest reason to not use wildcard imports is that you don't know what you're getting and they pollute your namespace and possibly clobber other types/variables.
>>> class Attribute(object):
...     foo = 9
...
>>> print Attribute.foo
9
>>> from zope.interface.common.idatetime import *
>>> print Attribute.foo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'Attribute' has no attribute 'foo'
Even if today the import * works without collision, there is no guarantee that it won't happen with future updates to the package being imported.
If you enter the name itself in the interpreter, its repr will tell you what its parent module is.
For example:
>>> from collections import *
>>> deque
<type 'collections.deque'>

How to generate a module object from a code object in Python

Given that I have the code object for a module, how do I get the corresponding module object?
It looks like moduleNames = {}; exec code in moduleNames does something very close to what I want. It returns the globals declared in the module into a dictionary. But if I want the actual module object, how do I get it?
EDIT:
It looks like you can roll your own module object. The module type isn't conveniently documented, but you can do something like this:
import sys
module = sys.__class__
del sys
foo = module('foo', 'Doc string')
foo.__file__ = 'foo.pyc'
exec code in foo.__dict__
As a comment already indicates, in today's Python the preferred way to instantiate types that don't have built-in names is to call the type obtained via the types module from the standard library:
>>> import types
>>> m = types.ModuleType('m', 'The m module')
note that this does not automatically insert the new module in sys.modules:
>>> import sys
>>> sys.modules['m']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'm'
That's a task you must perform by hand:
>>> sys.modules['m'] = m
>>> sys.modules['m']
<module 'm' (built-in)>
This can be important, since a module's code object normally executes after the module's added to sys.modules -- for example, it's perfectly correct for such code to refer to sys.modules[__name__], and that would fail (KeyError) if you forgot this step. After this step, and setting m.__file__ as you already have in your edit,
>>> code = compile("a=23", "m.py", "exec")
>>> exec code in m.__dict__
>>> m.a
23
(or the Python 3 equivalent where exec is a function, if Python 3 is what you're using, of course;-) is correct (of course, you'll normally have obtained the code object by subtler means than compiling a string, but that's not material to your question;-).
In older versions of Python you would have used the new module instead of the types module to make a new module object at the start, but new is deprecated since Python 2.6 and removed in Python 3.
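For Python 3, where exec is a function, the same steps might look like the sketch below. The module name m and the file name m.py follow the session above; registering the module before running its code mirrors the order described there.

```python
import sys
import types

# Normally the code object would come from elsewhere; compiling a
# string is just the simplest way to get one for the example.
code = compile("a = 23", "m.py", "exec")

m = types.ModuleType("m", "The m module")
m.__file__ = "m.py"

# Register first, so code that refers to sys.modules[__name__] works.
sys.modules["m"] = m

# Python 3 spelling of "exec code in m.__dict__".
exec(code, m.__dict__)
```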
