Python repr for classes

As the Python 2 documentation on __repr__ states:
If at all possible, this (i.e. __repr__) should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment).
So how come the built-in __repr__ for classes does not act according to that guideline?
Example
>>> class A(object):
...     pass
...
>>> repr(A)
"<class 'A'>"
To meet the guideline, the default __repr__ should return "A", i.e. generally A.__name__. Why is it acting differently? It would be extra-easy to implement, I believe.
Edit: The scope of 'reproduction'
I can see in the answers that it is not clear in the discussion what repr should return. The way I see it, the repr function should return a string that allows you to reproduce the object:
1. in an arbitrary context, and
2. automatically (i.e. not manually).
Ad.1. Take a look at a built-in class case (taken from this SO question):
>>> from datetime import date
>>>
>>> repr(date.today()) # calls date.today().__repr__()
'datetime.date(2009, 1, 16)'
Apparently, the assumed context is as if you used the basic form of import, i.e. import datetime, because if you tried eval(repr(date.today())) here, datetime would not be recognized. So the point is that __repr__ doesn't need to represent the object from scratch. It's enough if it is unambiguous in a context the community has agreed upon, e.g. using the module's own types and functions directly. Sounds reasonable, right?
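For instance, an illustrative session (the date shown is of course just an example):
>>> from datetime import date
>>> eval(repr(date.today()))            # only `date` is bound in this scope
Traceback (most recent call last):
  ...
NameError: name 'datetime' is not defined
>>> import datetime
>>> eval(repr(datetime.date.today()))   # with the assumed context, it round-trips
datetime.date(2009, 1, 16)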
Ad.2. Giving an impression of how the object could be reconstructed is not enough for repr, I believe. Helpfulness in debugging is the purpose of str.
Conclusion
So what I expect from repr is allowing me to do eval on the result. And in the case of a class, I would not like to get the whole code that would reconstruct the class from scratch. Instead, I would like to have an unambiguous reference to a class visible in my scope. "Module.Class" would suffice. No offence, Python, but "<class 'Module.Class'>" just doesn't cut it.

Consider a slightly more complicated class:
class B(object):
    def __init__(self):
        self.foo = 3
repr would need to return something like
type("B", (object,), { "__init__": lambda self: setattr(self, "foo", 3) })
Notice one difficulty already: not all functions defined by the def statement can be translated into a single lambda expression. Change B slightly:
class B(object):
    def __init__(self, y, x=2, **kwargs):
        print "in B.__init__"
How do you write an expression that defines B.__init__? You can't use
lambda self: print "in B.__init__"
because lambda expressions cannot contain statements. For this simple class, it is already impossible to write a single expression that defines the class completely.

Because the default __repr__ cannot know what statements were used to create the class.
The documentation you quote starts with If at all possible. Since it is not possible to represent custom classes in a way that lets you recreate them, a different format is used, which follows the default for all things not easily recreated.
If repr(A) were to just return 'A', that'd be meaningless. You would not be recreating A, you'd just be referencing it. "type('A', (object,), {})" would be closer to reflecting the class constructor, but that'd be a) confusing for people not familiar with the fact that Python classes are instances of type, and b) never able to reflect methods and attributes accurately.
Compare the output to that of repr(type) or repr(int) instead; these follow the same pattern.
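For instance, under Python 3:
>>> repr(type)
"<class 'type'>"
>>> repr(int)
"<class 'int'>"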

I know this is an older question, but I found a way to do it.
The only way I know to do this is with a metaclass like so:
class A(object):
    secret = 'a'
    class _metaA(type):
        @classmethod
        def __repr__(cls):
            return "<Repr for A: secret:{}>".format(A.secret)
    __metaclass__ = _metaA  # Python 2: picks _metaA as the metaclass of A
outputs:
>>> A
<Repr for A: secret:a>
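For reference, under Python 3 the same idea is spelled with the metaclass keyword argument. A minimal sketch along those lines (using cls.secret instead of A.secret so the metaclass isn't tied to one particular class):
class _MetaA(type):
    def __repr__(cls):
        return "<Repr for A: secret:{}>".format(cls.secret)

class A(metaclass=_MetaA):
    secret = 'a'

>>> A
<Repr for A: secret:a>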

Since neither "<class 'A'>" nor "A" can be used to re-create the class when its definition is not available, I think the question is moot.


Invisible argument python [duplicate]

When defining a method on a class in Python, it looks something like this:
class MyClass(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
But in some other languages, such as C#, you have a reference to the object that the method is bound to with the "this" keyword without declaring it as an argument in the method prototype.
Was this an intentional language design decision in Python or are there some implementation details that require the passing of "self" as an argument?
I like to quote Peters' Zen of Python. "Explicit is better than implicit."
In Java and C++, 'this.' can be deduced, except when you have variable names that make it impossible to deduce. So you sometimes need it and sometimes don't.
Python elects to make things like this explicit rather than based on a rule.
Additionally, since nothing is implied or assumed, parts of the implementation are exposed. self.__class__, self.__dict__ and other "internal" structures are available in an obvious way.
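For example, these "internals" are just ordinary attribute lookups:
>>> class C(object):
...     pass
...
>>> c = C()
>>> c.__class__
<class '__main__.C'>
>>> c.__dict__
{}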
It's to minimize the difference between methods and functions. It allows you to easily generate methods in metaclasses, or add methods at runtime to pre-existing classes.
e.g.
>>> class C:
...     def foo(self):
...         print("Hi!")
...
>>>
>>> def bar(self):
...     print("Bork bork bork!")
...
>>>
>>> c = C()
>>> C.bar = bar
>>> c.bar()
Bork bork bork!
>>> c.foo()
Hi!
>>>
It also (as far as I know) makes the implementation of the python runtime easier.
I suggest that one should read Guido van Rossum's blog on this topic - Why explicit self has to stay.
When a method definition is decorated, we don't know whether to automatically give it a 'self' parameter or not: the decorator could turn the function into a static method (which has no 'self'), or a class method (which has a funny kind of self that refers to a class instead of an instance), or it could do something completely different (it's trivial to write a decorator that implements '@classmethod' or '@staticmethod' in pure Python). There's no way without knowing what the decorator does whether to endow the method being defined with an implicit 'self' argument or not.
I reject hacks like special-casing '@classmethod' and '@staticmethod'.
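To back up the "trivial to write in pure Python" remark in that quote, here is a rough sketch of what such descriptors could look like (my_staticmethod and my_classmethod are made-up names, and the real builtins handle more corner cases):
class my_staticmethod(object):
    """Minimal stand-in for the builtin staticmethod."""
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        # No binding at all: just hand back the plain function.
        return self.func

class my_classmethod(object):
    """Minimal stand-in for the builtin classmethod."""
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        if objtype is None:
            objtype = type(obj)
        # Bind the class, not the instance, as the first argument.
        def bound(*args, **kwargs):
            return self.func(objtype, *args, **kwargs)
        return bound

class Example(object):
    @my_classmethod
    def which(cls):
        return cls.__name__

# Both Example.which() and Example().which() return 'Example'.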
Python doesn't force you to use "self". You can give it whatever name you want. You just have to remember that the first argument in a method definition header is a reference to the object.
Also allows you to do this (in short, invoking Outer(3).create_inner_class(4)().weird_sum_with_closure_scope(5) will return 12, but will do so in the craziest of ways):
class Outer(object):
    def __init__(self, outer_num):
        self.outer_num = outer_num

    def create_inner_class(outer_self, inner_arg):
        class Inner(object):
            def weird_sum_with_closure_scope(inner_self, num):
                # outer_self and inner_arg are picked up from the enclosing scope
                return num + outer_self.outer_num + inner_arg
        return Inner
Of course, this is harder to imagine in languages like Java and C#. By making the self reference explicit, you're free to refer to any object by that self reference. Also, such a way of playing with classes at runtime is harder to do in the more static languages - not that it's necessarily good or bad. It's just that the explicit self allows all this craziness to exist.
Moreover, imagine this: We'd like to customize the behavior of methods (for profiling, or some crazy black magic). This can lead us to think: what if we had a class Method whose behavior we could override or control?
Well here it is:
from functools import partial

class MagicMethod(object):
    """Does black magic when called"""
    def __get__(self, obj, obj_type):
        # This binds the <other> class instance to the <innocent_self> parameter
        # of the method MagicMethod.invoke
        return partial(self.invoke, obj)

    def invoke(magic_self, innocent_self, *args, **kwargs):
        # do black magic here
        ...
        print(magic_self, innocent_self, args, kwargs)

class InnocentClass(object):
    magic_method = MagicMethod()
And now InnocentClass().magic_method() will act as expected. The method will be bound with the innocent_self parameter to the InnocentClass instance, and with magic_self to the MagicMethod instance. Weird huh? It's like having 2 keywords this1 and this2 in languages like Java and C#. Magic like this allows frameworks to do stuff that would otherwise be much more verbose.
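Concretely, a call like the one above prints something along these lines (the addresses are illustrative):
>>> InnocentClass().magic_method(1, 2, key='value')
<__main__.MagicMethod object at 0x...> <__main__.InnocentClass object at 0x...> (1, 2) {'key': 'value'}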
Again, I don't want to comment on the ethics of this stuff. I just wanted to show things that would be harder to do without an explicit self reference.
I think it has to do with PEP 227:
Names in class scope are not accessible. Names are resolved in the
innermost enclosing function scope. If a class definition occurs in a
chain of nested scopes, the resolution process skips class
definitions. This rule prevents odd interactions between class
attributes and local variable access. If a name binding operation
occurs in a class definition, it creates an attribute on the resulting
class object. To access this variable in a method, or in a function
nested within a method, an attribute reference must be used, either
via self or via the class name.
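A small illustration of the quoted rule (the class and names are made up, and the exact NameError wording varies between Python versions):
>>> class Widget(object):
...     size = 10
...     def area(self):
...         return size * size      # class scope is skipped, so `size` is not found
...
>>> Widget().area()
Traceback (most recent call last):
  ...
NameError: name 'size' is not defined

The fix is to write self.size (or Widget.size) inside the method, exactly as the PEP says.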
I think the real reason, besides "The Zen of Python", is that functions are first-class citizens in Python.
That essentially makes them objects. Now the fundamental issue is: if your functions are objects as well, then, in the object-oriented paradigm, how would you send messages to objects when the messages themselves are objects?
It looks like a chicken-and-egg problem. To reduce this paradox, the only possible way is to either pass a context of execution to methods or detect it. But since Python can have nested functions, it would be impossible to detect, as the context of execution would change for inner functions.
This means the only possible solution is to explicitly pass 'self' (the context of execution).
So I believe it is an implementation problem; the Zen came much later.
As explained in self in Python, Demystified:
anything like obj.meth(args) becomes Class.meth(obj, args). The calling process is automatic while the receiving process is not (it's explicit). This is the reason the first parameter of a function in a class must be the object itself.
class Point(object):
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def distance(self):
        """Find distance from origin"""
        return (self.x**2 + self.y**2) ** 0.5
Invocations:
>>> p1 = Point(6,8)
>>> p1.distance()
10.0
__init__() defines three parameters but we passed just two (6 and 8). Similarly, distance() requires one argument but none were passed.
Why is Python not complaining about this argument number mismatch?
Generally, when we call a method with some arguments, the corresponding class function is called by placing the method's object before the first argument. So, anything like obj.meth(args) becomes Class.meth(obj, args). The calling process is automatic while the receiving process is not (it's explicit).
This is the reason the first parameter of a function in a class must be the object itself. Writing this parameter as self is merely a convention. It is not a keyword and has no special meaning in Python. We could use other names (like this), but I strongly suggest you don't. Using names other than self is frowned upon by most developers and degrades the readability of the code ("Readability counts").
...
In the first example, self.x is an instance attribute whereas x is a local variable. They are not the same and they live in different namespaces.
Self Is Here To Stay
Many have proposed to make self a keyword in Python, like this in C++ and Java. This would eliminate the redundant use of explicit self from the formal parameter list in methods. While this idea seems promising, it's not going to happen. At least not in the near future. The main reason is backward compatibility. Here is a blog from the creator of Python himself explaining why the explicit self has to stay.
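To make the Class.meth(obj, args) equivalence concrete, these two calls on the Point class above do exactly the same thing:
>>> p1 = Point(6, 8)
>>> p1.distance()
10.0
>>> Point.distance(p1)
10.0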
The 'self' parameter keeps the current calling object.
class class_name:
    class_variable = 0
    def method_name(self, arg):
        self.var = arg

obj = class_name()
obj.method_name(10)
Here, the self argument holds the object obj, so the statement self.var = arg sets obj.var.
There is also another very simple answer: according to the zen of python, "explicit is better than implicit".

How to keep python project modular?

Context
I've been working on a Python project recently, and found modularity very important. For example, say you made a class with some attributes and a line of code that uses those attributes, like:
a = A()
print("hi" + a.imA)
If you were to modify imA of class A to another type, you would have to modify the print statement. In my case I had to do this so many times. It was annoying and time consuming. get/set methods would've solved this, but I heard that get/set are not 'good python'. So how would you solve this problem without using get and set methods?
First point: you would have saved yourself quite some hassle by using string formatting instead of string concatenation, ie:
print("hi {}".format(a.imA))
Granted, the final result may or may not be what you'd expect depending on how a.imA's type implements __str__() and __repr__(), but at least this will not break the code.
Wrt getters and setters: they are indeed considered rather unpythonic, because Python has strong support for computed attributes, and a simple generic implementation is available as the builtin property type.
NB: what is actually considered unpythonic is to systematically use implementation attributes and getters/setters (either explicit or, as is the case with computed attributes, implicit) when a plain public attribute is enough. This is considered unpythonic because you can always turn a plain attribute into a computed one without breaking the client code (assuming, of course, you don't change the type or semantics of the attribute) - something that was not possible with early OOPLs like Smalltalk, C++ or Java (Smalltalk being a bit of a special case actually, but that's another topic).
In your case, if the point was to change the stored value's type without breaking the API, the simple obvious canonical solution was to use a property delegating to an implementation attribute:
before:
class Foo(object):
    def __init__(self, bar):
        # `bar` is expected to be the string representation of an int.
        self.bar = bar

    def frobnicate(self, val):
        return (int(self.bar) + val) / 2
after:
class Foo(object):
    def __init__(self, bar):
        # `bar` is expected to be the string representation of an int,
        # but we want to store it as an int.
        self.bar = bar

    @property
    def bar(self):
        return str(self._bar)

    @bar.setter
    def bar(self, value):
        self._bar = int(value)

    def frobnicate(self, val):
        # internally we use the implementation attribute `_bar`
        return (self._bar + val) / 2
And you now have the value stored internally as an int, but the public interface is (almost) exactly the same - the only difference being that passing something that cannot be converted by int() will raise at the expected place (when you set it) instead of breaking at the most unexpected one (when you call .frobnicate()).
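An illustrative session with the Foo sketch above (Python 3 division assumed):
>>> f = Foo("42")
>>> f.bar              # the public interface still deals in strings
'42'
>>> f._bar             # while the value is stored internally as an int
42
>>> f.frobnicate(8)    # (42 + 8) / 2
25.0
>>> f.bar = "not a number"
Traceback (most recent call last):
  ...
ValueError: invalid literal for int() with base 10: 'not a number'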
Now note that that changing the type of a public attribute is just like changing the return type of a getter (or the type of a setter argument) - in both cases you are breaking the contract - so if what you wanted was really to change the type of A.imA, neither getters nor properties would have solved your issue - getters and setters (or in Python computed attributes) can only protect you from implementation changes.
EDIT: oh and yes: this has nothing to do with modularity (which is about writing decoupled, self-contained code that's easier to read, test, maintain and eventually reuse), but with encapsulation (whose aim is to make the public interface resilient to implementation changes).
First, use
print(f"hi {a.imA}") # Python 3.6+
or
print("hi {}".format(a.imA)) # all Python 3
instead of
print("hi"+a.imA)
That way, str will be called automatically on each argument.
Then define a __str__ function in all your classes, so that printing any class always works.
class A:
    def __init__(self):
        self._member_1 = "spam"

    def __str__(self):
        return f"A(member 1: {self._member_1})"

Python builtin functions aren't really functions, right?

I was just thinking about Python's dict "function" and starting to realize that dict isn't really a function at all. For example, if we do dir(dict), we get all sorts of methods that aren't included in the usual namespace of a user-defined function. Extending that thought, it's similar for dir(list) and dir(len). They aren't functions, but really types. But then I'm confused about the documentation page, http://docs.python.org/2/library/functions.html, which clearly says functions. (I guess it should really just say builtin callables.)
So what gives? (It's starting to seem that the distinction between classes and functions is a trivial one.)
It's a callable, as are classes in general. Calling dict() is effectively to call the dict constructor. It is like when you define your own class (C, say) and you call C() to instantiate it.
One way that dict is special, compared to, say, sum, is that though both are callable, and both are implemented in C (in cpython, anyway), dict is a type; that is, isinstance(dict, type) == True. This means that you can use dict as the base class for other types, you can write:
class MyDictSubclass(dict):
    pass
but not
class MySumSubclass(sum):
    pass
This can be useful to make classes that behave almost like a builtin object, but with some enhancements. For instance, you can define a subclass of tuple that implements + as vector addition instead of concatenation:
class Vector(tuple):
    def __add__(self, other):
        return Vector(x + y for x, y in zip(self, other))
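For instance:
>>> Vector((1, 2, 3)) + Vector((10, 20, 30))
(11, 22, 33)
>>> type(Vector((1, 2)) + Vector((3, 4)))
<class '__main__.Vector'>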
Which brings up another interesting point. type is also implemented in C. It's also callable. Like dict (and unlike sum) it's an instance of type; isinstance(type, type) == True. Because of this weird, seemingly impossible cycle, type can be used to make new classes of classes, (called metaclasses). You can write:
class MyTypeSubclass(type):
    pass

class MyClass(object):
    __metaclass__ = MyTypeSubclass
or, in Python 3:
class MyClass(metaclass=MyTypeSubclass):
    pass
Which gives the interesting result that isinstance(MyClass, MyTypeSubclass) == True. How this is useful is a bit beyond the scope of this answer, though.
dict() is a constructor for a dict instance. When you do dir(dict) you're looking at the attributes of class dict. When you write a = dict() you're setting a to a new instance of type dict.
I'm assuming here that dict() is what you're referring to as the "dict function". Or are you calling an indexed instance of dict, e.g. a['my_key'] a function?
Note that calling dir on the constructor dict.__init__
dir(dict.__init__)
gives you what you would expect, including the same stuff as you'd get for any other function. Since a call to the dict() constructor results in a call to dict.__init__(instance), that explains where those function attributes went. (Of course there's a little extra behind-the-scenes work in any constructor, but that's the same for dicts as for any object.)

is it ever useful to define a class method with a reference to self not called 'self' in Python?

I'm teaching myself Python and I see the following in Dive into Python section 5.3:
By convention, the first argument of any Python class method (the reference to the current instance) is called self. This argument fills the role of the reserved word this in C++ or Java, but self is not a reserved word in Python, merely a naming convention. Nonetheless, please don't call it anything but self; this is a very strong convention.
Considering that self is not a Python keyword, I'm guessing that it can sometimes be useful to use something else. Are there any such cases? If not, why is it not a keyword?
No, unless you want to confuse every other programmer that looks at your code after you write it. self is not a keyword because it is an identifier. It could have been a keyword and the fact that it isn't one was a design decision.
As a side observation, note that Pilgrim is committing a common misuse of terms here: a class method is quite a different thing from an instance method, which is what he's talking about here. As wikipedia puts it, "a method is a subroutine that is exclusively associated either with a class (in which case it is called a class method or a static method) or with an object (in which case it is an instance method).". Python's built-ins include a staticmethod type, to make static methods, and a classmethod type, to make class methods, each generally used as a decorator; if you don't use either, a def in a class body makes an instance method. E.g.:
>>> class X(object):
...     def noclass(self): print self
...     @classmethod
...     def withclass(cls): print cls
...
>>> x = X()
>>> x.noclass()
<__main__.X object at 0x698d0>
>>> x.withclass()
<class '__main__.X'>
>>>
As you see, the instance method noclass gets the instance as its argument, but the class method withclass gets the class instead.
So it would be extremely confusing and misleading to use self as the name of the first parameter of a class method: the convention in this case is instead to use cls, as in my example above. While this IS just a convention, there is no real good reason for violating it -- any more than there would be, say, for naming a variable number_of_cats if the purpose of the variable is counting dogs!-)
The only case of this I've seen is when you define a function outside of a class definition, and then assign it to the class, e.g.:
class Foo(object):
    def bar(self):
        # Do something with 'self'
        pass

def baz(inst):
    return inst.bar()

Foo.baz = baz
In this case, self is a little strange to use, because the function could be applied to many classes. Most often I've seen inst or cls used instead.
I once had some code like (and I apologize for lack of creativity in the example):
class Animal:
    def __init__(self, volume=1):
        self.volume = volume
        self.description = "Animal"

    def Sound(self):
        pass

    def GetADog(self, newvolume):
        class Dog(Animal):
            def Sound(this):
                return self.description + ": " + ("woof" * this.volume)
        return Dog(newvolume)
Then we have output like:
>>> a = Animal(3)
>>> d = a.GetADog(2)
>>> d.Sound()
'Animal: woofwoof'
I wasn't sure if self within the Dog class would shadow self within the Animal class, so I opted to make Dog's reference the word "this" instead. In my opinion and for that particular application, that was more clear to me.
Because it is a convention, not language syntax. There is a Python style guide that people who program in Python follow. This way libraries have a familiar look and feel. Python places a lot of emphasis on readability, and consistency is an important part of this.
I think that the main reason self is used by convention rather than being a Python keyword is because it's simpler to have all methods/functions take arguments the same way rather than having to put together different argument forms for functions, class methods, instance methods, etc.
Note that if you have an actual class method (i.e. one defined using the classmethod decorator), the convention is to use "cls" instead of "self".

Python introspection: How to get an 'unsorted' list of object attributes?

The following code
import types

class A:
    class D:
        pass
    class C:
        pass

for d in dir(A):
    if type(eval('A.' + d)) is types.ClassType:
        print d
outputs
C
D
How do I get it to output in the order in which these classes were defined in the code? I.e.
D
C
Is there any way other than using inspect.getsource(A) and parsing that?
Note that that parsing is already done for you in inspect - take a look at inspect.findsource, which searches the module for the class definition and returns the source and line number. Sorting on that line number (you may also need to split out classes defined in separate modules) should give the right order.
However, this function doesn't seem to be documented, and is just using a regular expression to find the line, so it may not be too reliable.
Another option is to use metaclasses, or some other way of attaching ordering information to the object, either implicitly or explicitly. For example:
import itertools, operator

next_id = itertools.count().next

class OrderedMeta(type):
    def __init__(cls, name, bases, dct):
        super(OrderedMeta, cls).__init__(name, bases, dct)
        cls._order = next_id()

# Set the default metaclass
__metaclass__ = OrderedMeta

class A:
    class D:
        pass
    class C:
        pass

print sorted([cls for cls in [getattr(A, name) for name in dir(A)]
              if isinstance(cls, OrderedMeta)], key=operator.attrgetter("_order"))
However this is a fairly intrusive change (requires setting the metaclass of any classes you're interested in to OrderedMeta)
The inspect module also has the findsource function. It returns a tuple of source lines and line number where the object is defined.
>>> import inspect
>>> import StringIO
>>> inspect.findsource(StringIO.StringIO)[1]
41
>>>
The findsource function actually searches through the source file and looks for likely candidates if it is given a class object.
Given a method-, function-, traceback-, frame-, or code-object, it simply looks at the co_firstlineno attribute of the (contained) code-object.
No, you can't get those attributes in the order you're looking for. Python attributes are stored in a dict (read: hashmap), which has no awareness of insertion order.
Also, I would avoid the use of eval by simply saying
if type(getattr(A, d)) is types.ClassType:
    print d
in your loop. Note that you can also just iterate through key/value pairs in A.__dict__
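For example, sticking with the Python 2 style of the question, the loop could be written without eval as follows (note that this still gives no ordering guarantee, which is the point above):
import types

for name, value in A.__dict__.items():
    if isinstance(value, types.ClassType):  # old-style nested classes, as in the question
        print name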
AFAIK, no -- there isn't*. This is because all of a class's attributes are stored in a dictionary (which is, as you know, unordered).
*: it might actually be possible, but that would require either decorators or possibly metaclass hacking. Do either of those interest you?
import inspect

class ExampleObject:
    def example2():
        pass
    def example1():
        pass

context = ExampleObject

def sort_key(item):
    return inspect.findsource(item)[1]

properties = [
    getattr(context, attribute) for attribute in dir(context)
    if callable(getattr(context, attribute)) and
    not attribute.startswith('__')
]
properties.sort(key=sort_key)
print(properties)
Should print out:
[<function ExampleObject.example2 at 0x7fc2baf9e940>, <function ExampleObject.example1 at 0x7fc2bae5e790>]
I needed this as well for a compiler I'm building, and it proved very useful.
I'm not trying to be glib here, but would it be feasible for you to organize the classes in your source alphabetically? I find that when there are lots of classes in one file this can be useful in its own right.
